LP Decodable Permutation Codes based on Linearly 
Constrained Permutation Matrices 



A set of linearly constrained permutation matrices is proposed for constructing a class of permutation codes. Making use of
linear constraints imposed on the permutation matrices, we can formulate a minimum Euclidean distance decoding problem for
the proposed class of permutation codes as a linear programming (LP) problem. The main feature of this class of permutation
codes, called LP decodable permutation codes, is this LP decodability. It is demonstrated that the LP decoding performance of
the proposed class of permutation codes is characterized by the vertices of the code polytope of the code. Two types of linear
constraints are discussed: structured constraints and random constraints. Structured constraints such as pure
involution lead to an efficient encoding algorithm. Random constraints, on the other hand, enable us to use probabilistic methods
for analyzing several code properties such as the average cardinality and the average weight distribution.

Index Terms: permutation codes, linear programming, polytope, decoding, error correction 



The class of linear codes defined over a finite field is ubiquitously employed in digital equipment for achieving reliable
communication and storage systems. For example, the class of codes includes practically important codes such as Reed-Solomon 
codes, BCH codes, and LDPC codes. The linearity of codes enables us to use efficient encoding and decoding algorithms based 
on their linear algebraic properties. 

On the other hand, there are some classes of nonlinear codes which are interesting from both theoretical and practical points 
of view. The class of permutation codes is such a class of nonlinear codes. 

The origin of permutation codes dates back to the 1960s. Slepian [17] proposed a class of simple permutation codes, which 
is referred to as permutation modulation, and efficient soft decoding algorithms for these codes. The variant I code [17] is 
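obtained by applying all the permutations to the initial vector

$$(\underbrace{\mu_1, \ldots, \mu_1}_{n_1}, \underbrace{\mu_2, \ldots, \mu_2}_{n_2}, \ldots, \underbrace{\mu_k, \ldots, \mu_k}_{n_k}),$$

where $\mu_i$ is a real value and $n = n_1 + \cdots + n_k$. This research has been extended and investigated by a number of researchers.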
Biglieri and Elia [19], Karlof [18], Ingemarsson [20] studied optimization of the initial vector of the permutation modulation. 
Berger et al. [21] discussed applications of permutation codes to source coding problems. 

There is another thread of research on a class of permutation codes of length $n$ whose codewords contain exactly $n$ distinct
symbols; i.e., any codeword can be obtained by applying a permutation to an initial vector, e.g., $(0, 1, \ldots, n-1)$.

Some fundamental properties of such permutation codes were discussed in Blake et al. [1], and Frankl and Deza [8]. Vinck 
[13] [14] proposed applications of permutation codes for power-line communication and this triggered subsequent works on 
permutation codes. Wadayama and Vinck [16] presented a multi-level construction of permutation codes with large minimum
Hamming distance. A number of constructions for permutation codes have been developed, including the constructions given
in [4] [6]. In particular, the idea of a distance-preserving map due to Vinck and Ferreira [15] influenced the study of
permutation codes, including subsequent works by Chang et al. [2] [3].

Recently, rank modulation codes for flash memory proposed by Jiang et al. [9] [10] generated renewed interest in permutation 
codes. For example, for flash memory coding, Kløve et al. gave a new construction for permutation codes based on the Chebyshev
distance [11], which is an appropriate distance measure for flash memory coding. Barg and Mazumdar [24] also studied some
fundamental bounds on permutation codes in terms of the Kendall tau distance. 

In order to employ a permutation code in a practical application, efficient encoding and soft-decoding algorithms are crucial to 
achieve reliable communication over noisy channels, such as an AWGN channel. Nonlinearity of permutation codes prevents
the use of conventional encoding and decoding techniques based on linear algebraic properties. Although much work on
permutation codes has been conducted, efficient soft-decoding has not been intensively discussed so far. Therefore,
there is still room for further research on permutation codes with efficient encoding and soft-decoding algorithms.

In this paper, a new class of permutation codes called LP decodable permutation codes is introduced. An LP decodable 
permutation code is obtained by applying permutation matrices satisfying certain linear constraints to an $n$-dimensional real
initial vector.

It is well known that permutation matrices are vertices of the Birkhoff polytope [35], which is the set of doubly stochastic 
matrices. Thus, a set of linearly constrained permutation matrices can be expressed by a set of linear equalities and linear 
inequalities. This property leads to the main feature of this class of permutation codes: the LP-decodable property. For this class of
codes, a decoding problem can be formulated as a linear programming (LP) problem. This means that we can exploit efficient 
LP solvers based on simplex methods or interior point methods to decode LP decodable permutation codes. 

Furthermore, for a combination of this class of codes and its LP decoding, the maximum likelihood (ML) certificate property 
can be proved as in the case of the LP decoding for LDPC codes [7]. This is due to the fact that the LP problem given in this 
paper is a relaxed problem of an ML decoding problem. 

In general, a fundamental polytope [27] [7] used for LP decoding of LDPC codes contains a number of fractional vertices, 
which are a major source of sub-optimality of LP decoding. The constraints corresponding to an LDPC matrix are defined
based on $\mathbb{F}_2$ arithmetic. On the other hand, an LP decoder works on the real number field. This domain mismatch produces
many undesirable fractional vertices on the fundamental polytope. One motivation of the present study is to establish a coding 
scheme without this mismatch. In other words, the LP decodable permutation codes are defined on the real number field and 
are decoded using an LP solver working on the real number field. 

The organization of the paper is as follows. Section II introduces some definitions and notation required for discussion. Section III gives the definition of the LP decodable permutation codes and their decoding algorithm. Section IV provides analysis for decoding performance of LP decoding and ML decoding. Section V presents some classes of permutation codes which are easy to encode. Section VI offers probabilistic analysis on the cardinality and weight distribution of random LP decodable permutation codes. Section VII gives a concluding summary.

II. Preliminaries

A. Notation and definition 

In this paper, matrices are represented by capital letters and a vector is assumed to be a column vector. Let $X$ be an $n \times n$ real matrix. The notation $X \ge 0$ means that every element in $X$ is non-negative. The notation $\mathrm{vec}(X)$ represents a vectorization of $X$ given by

$$\mathrm{vec}(X) = (X_{1,1} \cdots X_{1,n}\ X_{2,1} \cdots X_{2,n}\ X_{3,1} \cdots X_{n,n})^T.$$

The vector $\mathbf{1}$ is the all-one vector whose length is determined by the context. The norm $\|\cdot\|$ denotes the Euclidean norm given by $\|x\| = (x^T x)^{1/2}$. The trace function $\mathrm{trace}(X)$ returns the sum of the diagonal elements of $X$. The sets $\mathbb{R}, \mathbb{Z}$ are the sets of real numbers and integers, respectively. The set $[\alpha, \beta]$ denotes the set of consecutive integers from $\alpha \in \mathbb{Z}$ to $\beta \in \mathbb{Z}$. The symbol $\unlhd$ means

$$(a_1, \ldots, a_m)^T \unlhd (b_1, \ldots, b_m)^T \iff \forall i \in [1, m],\ a_i \unlhd_i b_i,$$

where $\unlhd_i$ is either $=$ or $\le$. For simplicity, the notation $\unlhd = (\unlhd_1, \unlhd_2, \ldots, \unlhd_m)^T$ is used to define $\unlhd$ (e.g., $\unlhd = (\le, =, \le)^T$).
The next definition gives a class of matrices of crucial importance in this paper.

Definition 1 (Permutation matrix): An $n \times n$ binary real matrix $X = (X_{i,j})_{i,j \in [1,n]} \in \{0,1\}^{n \times n}$ is called a permutation matrix if and only if

$$\forall i \in [1,n],\ \sum_{j' \in [1,n]} X_{i,j'} = 1, \qquad \forall j \in [1,n],\ \sum_{i' \in [1,n]} X_{i',j} = 1. \tag{1}$$

□

The set of $n \times n$ permutation matrices is denoted by $\Pi_n$. The cardinality of $\Pi_n$ is $n!$.

Removing the binary constraint from the definition of the permutation matrices, we have the definition of doubly stochastic 
matrices. 

Definition 2 (Doubly stochastic matrix): An $n \times n$ non-negative real matrix $X = (X_{i,j})_{i,j \in [1,n]}$ is called a doubly stochastic matrix if and only if (1) holds. □

The following theorem for doubly stochastic matrices implies that the set of doubly stochastic matrices is a convex polytope.

Theorem 1 (Birkhoff-von Neumann theorem [35] [36] ): Every doubly stochastic matrix is a convex combination of permu- 
tation matrices. 

The set of $n \times n$ doubly stochastic matrices is a polytope called the Birkhoff polytope $B_n$ [35], which is also known as
the perfect matching polytope. The Birkhoff polytope is an $(n-1)^2$-dimensional convex polytope with $n!$ vertices and $n^2$ facets
[34]. The Birkhoff-von Neumann theorem implies that any vertex (i.e., extreme point) of the Birkhoff polytope is a permutation
matrix and vice versa.

B. LP decoding for permutation vectors 

Assume that $s \in \mathbb{R}^n$, called the initial vector, is given.¹ The set of images of $s$ by left action of $X \in \Pi_n$ is called the
permutation vectors of $s$, which is given by

$$\Lambda(s) \triangleq \{Xs \mid X \in \Pi_n\}. \tag{2}$$

¹ The elements in $s$ are not necessarily distinct from each other.
For example, if $s = (0, 1, 2)^T$, then $\Lambda(s)$ is given by

$$\Lambda(s) = \{(0,1,2)^T, (0,2,1)^T, (1,0,2)^T, (1,2,0)^T, (2,0,1)^T, (2,1,0)^T\}.$$

We here consider a situation in which a vector of $\Lambda(s)$ is transmitted to a receiver over an AWGN channel. In such a case,
it is desirable to use an ML decoding algorithm to estimate the transmitted vector. The ML decoding rule can be described as

$$\hat{x} = \arg\min_{x \in \Lambda(s)} \|y - x\|^2, \tag{3}$$

where $y$ is a received word.

The next theorem states that the ML decoding for $\Lambda(s)$ can be formulated as the following LP problem.

Theorem 2 (LP decoding and ML certificate property): Assume that a vector in $\Lambda(s)$ is transmitted over an AWGN channel
and that $y \in \mathbb{R}^n$ is received on the receiver side. We also suppose that $\hat{x} = \arg\min_{x \in \Lambda(s)} \|y - x\|^2$ is uniquely determined
from $y$. Let $X^*$ be the solution of the following LP problem:

$$\begin{aligned}
&\text{maximize } \mathrm{trace}(C^T X) \\
&\text{subject to} \\
&\quad X \in \mathbb{R}^{n \times n}, \\
&\quad X \mathbf{1} = \mathbf{1}, \\
&\quad \mathbf{1}^T X = \mathbf{1}^T, \\
&\quad X \ge 0,
\end{aligned} \tag{4}$$

where $C = y s^T$. If $X^*$ is integral, $\hat{x} = X^* s$ holds.

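Proof: The linear constraints in the above LP problem imply that $X$ is constrained to be a doubly stochastic matrix.
On the other hand, the ML decoding rule can be recast as follows:

$$\begin{aligned}
\hat{x} &= \arg\min_{x \in \Lambda(s)} \|y - x\|^2 \\
&= \Big(\arg\min_{X \in \Pi_n} \|y - Xs\|^2\Big) s \\
&= \Big(\arg\min_{X \in \Pi_n} \big(\|y\|^2 - 2 y^T (Xs) + \|Xs\|^2\big)\Big) s \\
&= \Big(\arg\max_{X \in \Pi_n} y^T X s\Big) s = \Big(\arg\max_{X \in \Pi_n} \mathrm{trace}(C^T X)\Big) s,
\end{aligned}$$

where $C = y s^T$. Note that $\|Xs\|^2 = \|s\|^2$ holds for every permutation matrix $X$, so this term does not affect the minimization, and that $y^T X s = \sum_{i=1}^{n} \sum_{j=1}^{n} y_i s_j X_{i,j} = \mathrm{trace}(C^T X)$.

Since the vertices of the Birkhoff polytope are exactly the permutation matrices, the ML decoding can be formulated as an integer LP (ILP)
problem:

$$\text{maximize } \mathrm{trace}(C^T X) \quad \text{subject to } X \in B_n,\ X \text{ is an integral matrix}.$$

By removing the integral constraint ($X$ is an integral matrix), we obtain the LP problem (4). If the solution of this LP problem
is integral, it must coincide with the solution of the above ILP problem. ■

As we have seen, the feasible set of the above LP problem is the Birkhoff polytope, all of whose vertices are integral. Thus, whenever the optimum of the above LP is unique, it is attained at a permutation matrix, and the output of the LP is integral.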
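The following example illustrates an LP decoding procedure.

Example 1: Let $s = (0, 1)^T$. In this case, the set of permutation vectors becomes $\Lambda(s) = \{(0,1)^T, (1,0)^T\}$. Assume that $y = (0.9, 0.2)^T$ is received. In this case,

$$C = y s^T = \begin{pmatrix} 0.9 \\ 0.2 \end{pmatrix} \begin{pmatrix} 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0.9 \\ 0 & 0.2 \end{pmatrix}$$

is obtained. By letting

$$X = \begin{pmatrix} X_{1,1} & X_{1,2} \\ X_{2,1} & X_{2,2} \end{pmatrix},$$

we have the objective function

$$\mathrm{trace}\left( \begin{pmatrix} 0 & 0 \\ 0.9 & 0.2 \end{pmatrix} \begin{pmatrix} X_{1,1} & X_{1,2} \\ X_{2,1} & X_{2,2} \end{pmatrix} \right) = 0.9 X_{1,2} + 0.2 X_{2,2}.$$

As a result, the LP decoding problem is given by

$$\begin{aligned}
&\text{maximize } 0.9 X_{1,2} + 0.2 X_{2,2} \\
&\text{subject to} \\
&\quad X_{1,1} + X_{1,2} = 1,\quad X_{2,1} + X_{2,2} = 1, \\
&\quad X_{1,1} + X_{2,1} = 1,\quad X_{1,2} + X_{2,2} = 1, \\
&\quad X_{1,1}, X_{1,2}, X_{2,1}, X_{2,2} \ge 0.
\end{aligned}$$

The solution of the problem is

$$X^* = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

and then we have the estimated word $X^* s = (1, 0)^T$. □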
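The decoding rule of Theorem 2 can be checked numerically on small instances. The following is a minimal sketch, not the LP solver itself: it searches $\Lambda(s)$ exhaustively for the vector maximizing $y^T X s = \mathrm{trace}(C^T X)$, and compares the result with a sorting-based decoder that relies on the rearrangement inequality (the largest received value is matched with the largest symbol of $s$). The function names are illustrative assumptions and do not appear in the paper.

```python
from itertools import permutations

def ml_decode_bruteforce(y, s):
    """Return the vector x in Lambda(s) that maximizes y^T x = trace(C^T X)."""
    best, best_score = None, float("-inf")
    for x in set(permutations(s)):              # every permutation vector of s
        score = sum(yi * xi for yi, xi in zip(y, x))
        if score > best_score:
            best, best_score = x, score
    return best

def ml_decode_sorting(y, s):
    """Same decoder via sorting: the k-th smallest y_i receives the k-th smallest s_j."""
    order = sorted(range(len(y)), key=lambda i: y[i])   # indices of y, ascending
    symbols = sorted(s)
    x = [None] * len(y)
    for rank, i in enumerate(order):
        x[i] = symbols[rank]
    return tuple(x)

# Setting of Example 1.
y, s = (0.9, 0.2), (0, 1)
print(ml_decode_bruteforce(y, s), ml_decode_sorting(y, s))
```

The exhaustive search is exponential in $n$ and is only meant as a reference; the sorting variant assumes the maximizer is unique, as in Theorem 2.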

III. Linearly constrained permutation matrices and LP decodable permutation codes 

It is natural to consider an extension of the LP decoding presented in the previous section. Additional linear constraints
imposed on $\Pi_n$ produce a restricted subset of $\Lambda(s)$. A decoding problem of such a set can be formulated as an LP problem, as
in the case of the ML decoding of $\Lambda(s)$.



A. Definitions 

The next definition for linearly constrained permutations gives an LP-decodable subset of $\Lambda(s)$.

Definition 3 (linearly constrained permutation matrix): Let $m, n$ be positive integers. Assume that $A \in \mathbb{Z}^{m \times n^2}$, $b \in \mathbb{Z}^m$, and $\unlhd \in \{=, \le\}^m$ are given. A set of linearly constrained permutation matrices is defined by

$$\Pi(A, b, \unlhd) = \{X \in \Pi_n \mid A\,\mathrm{vec}(X) \unlhd b\}. \tag{6}$$

□

Note that $A\,\mathrm{vec}(X) \unlhd b$ formally represents $m$ additional equalities and inequalities. These additional constraints provide a
restriction on permutation matrices.

From the linearly constrained permutation matrices, LP decodable permutation codes are naturally defined as follows.
Definition 4 (LP decodable permutation code): Assume the same setup as in Definition 3. Suppose also that $s \in \mathbb{R}^n$ is
given. The set of vectors $\Lambda(A, b, \unlhd, s)$ given by

$$\Lambda(A, b, \unlhd, s) = \{Xs \in \mathbb{R}^n \mid X \in \Pi(A, b, \unlhd)\} \tag{7}$$

is called an LP decodable permutation code. □

If $X^{(1)} s \ne X^{(2)} s$ holds for any $X^{(1)}, X^{(2)} \in \Pi(A, b, \unlhd)$ with $X^{(1)} \ne X^{(2)}$, then an LP decodable permutation code is said
to be non-singular. Namely, there is a one-to-one correspondence between permutation matrices in $\Pi(A, b, \unlhd)$ and codewords of
$\Lambda(A, b, \unlhd, s)$ if a code is non-singular. Note that a code may become singular if identical symbols exist in $s$.

The next example shows a case where an additional linear constraint imposes a restriction on permutation matrices. 
Example 2: Consider the set of linearly constrained permutation matrices which consists of $4 \times 4$ permutation matrices
satisfying the linear constraint $\mathrm{trace}(X) = 0$. The constraint implies that the diagonal elements of the permutation matrices
are constrained to be zero. This means that such permutation matrices correspond to permutations without fixed points, which
are called derangements. For $n = 4$, there are 9 derangement permutation matrices.
In this case, the triple $(A, b, \unlhd)$ is defined by

$$A = \mathrm{vec}(I)^T, \quad b = 0, \quad \unlhd = (=), \tag{8}$$

where $I$ is the $4 \times 4$ identity matrix. Multiplying these matrices to the initial vector $s = (0, 1, 2, 3)^T$ from the left, we immediately
obtain the members of $\Lambda(A, b, \unlhd, (0, 1, 2, 3)^T)$:

$$\begin{aligned}
&(1,0,3,2)^T,\ (1,2,3,0)^T,\ (1,3,0,2)^T, \\
&(2,0,3,1)^T,\ (2,3,0,1)^T,\ (2,3,1,0)^T, \\
&(3,0,1,2)^T,\ (3,2,0,1)^T,\ (3,2,1,0)^T.
\end{aligned} \tag{9}$$

This code is thus non-singular. If the initial vector is

$$s = (0, 0, 0, 0)^T,$$

then the resulting code has the only codeword $(0, 0, 0, 0)^T$. In this case, the code becomes singular. □
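The construction in Example 2 can be reproduced by exhaustive enumeration. The sketch below is only illustrative: it lists all $n!$ permutation matrices, keeps those satisfying $A\,\mathrm{vec}(X) \unlhd b$, and collects the codewords $Xs$. The helper names and the string encoding of the relations (`"="`, `"<="`) are assumptions made for this example.

```python
from itertools import permutations

def perm_matrix(p):
    """Permutation matrix X with X[i][p[i]] = 1, so that (Xs)_i = s[p[i]]."""
    n = len(p)
    return [[1 if j == p[i] else 0 for j in range(n)] for i in range(n)]

def vec(X):
    """Row-by-row vectorization of X, matching the definition of vec(X)."""
    return [x for row in X for x in row]

def satisfies(X, A, b, rel):
    """Check A vec(X) (rel) b, where rel[k] is '=' or '<='."""
    v = vec(X)
    for row, bk, r in zip(A, b, rel):
        lhs = sum(a * x for a, x in zip(row, v))
        if (r == "=" and lhs != bk) or (r == "<=" and lhs > bk):
            return False
    return True

def lp_decodable_code(A, b, rel, s):
    """Sorted list of distinct codewords Xs with X in Pi(A, b, rel)."""
    n = len(s)
    words = set()
    for p in permutations(range(n)):
        X = perm_matrix(p)
        if satisfies(X, A, b, rel):
            words.add(tuple(sum(X[i][j] * s[j] for j in range(n)) for i in range(n)))
    return sorted(words)

n = 4
identity = perm_matrix(tuple(range(n)))
A, b, rel = [vec(identity)], [0], ["="]        # the single constraint trace(X) = 0
codewords = lp_decodable_code(A, b, rel, (0, 1, 2, 3))
print(len(codewords), codewords)
print(lp_decodable_code(A, b, rel, (0, 0, 0, 0)))   # singular case
```

With $s = (0,1,2,3)^T$ the enumeration yields the derangement codewords of (9); with the all-zero initial vector all nine matrices collapse to a single codeword.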
B. LP decoding for LP decodable permutation codes

The LP decoding of $\Lambda(A, b, \unlhd, s)$ is a natural extension of the LP decoding for $\Lambda(s)$. Assume that a vector in $\Lambda(A, b, \unlhd, s)$
is transmitted over an AWGN channel and $y \in \mathbb{R}^n$ is given. The procedure for the LP decoding of $\Lambda(A, b, \unlhd, s)$ is given as
follows.

LP decoding for an LP decodable permutation code

1) Solve the following LP problem and let $X^*$ be the solution.

$$\begin{aligned}
&\text{maximize } \mathrm{trace}(C^T X) \\
&\text{subject to} \\
&\quad X \in \mathbb{R}^{n \times n}, \\
&\quad X \ge 0, \\
&\quad X \mathbf{1} = \mathbf{1}, \\
&\quad \mathbf{1}^T X = \mathbf{1}^T, \\
&\quad A\,\mathrm{vec}(X) \unlhd b,
\end{aligned} \tag{10}$$

where $C = y s^T$.

2) Output $X^* s$ if $X^*$ is integral. Otherwise, declare decoding failure.



C. Remarks 

Several remarks should be made regarding the LP decoding for $\Lambda(A, b, \unlhd, s)$.

The feasible set of (10) is a subset of the feasible set of (4). All the matrices in $\Pi(A, b, \unlhd)$ are feasible and permutation matrices which do not belong to $\Pi(A, b, \unlhd)$ are infeasible. This implies that the set of integral points of the feasible set of (10) coincides with $\Pi(A, b, \unlhd)$.

The LP problem (10) is a relaxed problem of the ML decoding problem over AWGN channels:

$$\text{minimize } \|y - x\|^2 \quad \text{subject to } x \in \Lambda(A, b, \unlhd, s). \tag{11}$$

This can be easily shown, as in the case of (4). As a consequence of the above properties on integral points and on the relaxation,
it can be concluded that the LP decoding for $\Lambda(A, b, \unlhd, s)$ has the ML-certificate property as well. Namely, if the output of LP
decoding is not decoding failure (i.e., $X^*$ is integral), the output is exactly the same as the solution of the minimum distance
decoding problem (11). Note that the LP decoding presented above becomes the ML decoding if the code polytope is integral.



The feasible set of the LP problem (10) is the intersection of the Birkhoff polytope and a (possibly unbounded) convex
set defined by the additional constraints. The intersection becomes a polytope which is called a code polytope. The decoding
performance of LP decoding is closely related to the code polytope given by the following definition.

Definition 5 (Code polytope): The polytope $\mathcal{P}(A, b, \unlhd)$ defined by

$$\mathcal{P}(A, b, \unlhd) = B_n \cap \{X \in \mathbb{R}^{n \times n} \mid A\,\mathrm{vec}(X) \unlhd b\} \tag{12}$$

is called the code polytope for $\Pi(A, b, \unlhd)$, where $B_n$ is the Birkhoff polytope corresponding to $\Pi_n$. □
[Figure 1: schematic of a code polytope. The Birkhoff polytope is cut by the additional constraints $A\,\mathrm{vec}(X) \unlhd b$; integral vertices and a fractional vertex are marked.]

Fig. 1. Code polytope $\mathcal{P}(A, b, \unlhd)$



Figure 1 illustrates a code polytope. It should be remarked that the set of integral vertices of the code polytope coincides with $\Pi(A, b, \unlhd)$. Due to the additional linear constraints $A\,\mathrm{vec}(X) \unlhd b$, a code polytope may have some fractional vertices, which
contain fractional components.

In an LP decoding process, these fractional vertices become possible candidates of an LP solution. Thus, these fractional
vertices can be considered as pseudo permutation matrices which degrade the decoding performance of the LP decoding.



IV. Analysis for decoding performance of LP decoding and ML decoding

In this section, upper bounds on decoding error probability for LP decoding and ML decoding are presented.

A. Upper bound on LP decoding error probability

An advantage of the LP formulation of a decoding algorithm is its simplicity for detailed decoding performance analysis.
The geometrical properties of a code polytope are closely related to the decoding performance of the LP decoding. We can
evaluate the block error probability of the proposed scheme with reasonable accuracy if we have enough information on the
set of vertices of a code polytope. The bound presented in this section has a close relationship to the pseudo codeword analysis
on LDPC codes [5].

In this section, a set of parameters $A, b, \unlhd, s$ is assumed to be given. Let $V$ be the set of vertices of the code polytope
$\mathcal{P}(A, b, \unlhd)$. In general, $V$ contains fractional vertices.

The next lemma gives a bridge between a code polytope and the corresponding decoding error probability.

Lemma 1 (Upper bound on block error rate for LPD): Assume that a codeword $Xs$ is transmitted to a receiver via an
AWGN channel, where $X \in \Pi(A, b, \unlhd)$. The additive white Gaussian noise with mean $0$ and variance $\sigma^2$ is assumed. The
receiver uses the LP decoding algorithm presented in the previous section. In this case, the block error probability $P_{LP}(X)$ is
upper bounded by

$$P_{LP}(X) \le \sum_{\tilde{X} \in V \setminus \{X\}} Q\left( \frac{\|Xs\|^2 - (\tilde{X}s)^T (Xs)}{\sigma \|Xs - \tilde{X}s\|} \right), \tag{13}$$

where the Q-function is the tail probability of the standard Gaussian distribution, which is given by

$$Q(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt. \tag{14}$$



Proof: Let $y = Xs + z$, where $z$ is an additive white Gaussian noise term. We first consider the pairwise block error
probability $P_e(X, \tilde{X})$ between $X$ and a vertex $\tilde{X} \in V$ ($\tilde{X} \ne X$), which is given by

$$P_e(X, \tilde{X}) = \mathrm{Prob}[y^T \tilde{X} s \ge y^T X s]. \tag{15}$$

Namely, $P_e(X, \tilde{X})$ is the probability that $\tilde{X}s$ is more likely than $Xs$ for a given $y$ under the assumption that only $X$
and $\tilde{X}$ are allowable matrices.

The difference $y^T \tilde{X} s - y^T X s$ can be transformed into

$$\begin{aligned}
y^T \tilde{X} s - y^T X s &= (Xs + z)^T (\tilde{X}s - Xs) \\
&= (\tilde{X}s - Xs)^T z + (\tilde{X}s - Xs)^T Xs \\
&= (\tilde{X}s - Xs)^T z - \big(\|Xs\|^2 - (\tilde{X}s)^T Xs\big).
\end{aligned} \tag{16}$$

We thus have

$$\mathrm{Prob}[y^T \tilde{X} s \ge y^T X s] = \mathrm{Prob}[a^T z \ge b], \tag{17}$$

where $a \in \mathbb{R}^n$ and $b \in \mathbb{R}$ are given by

$$a = \tilde{X}s - Xs, \tag{18}$$

$$b = \|Xs\|^2 - (\tilde{X}s)^T Xs. \tag{19}$$

The left-hand side of $a^T z \ge b$ is a linear combination of Gaussian noises. The mean of $a^T z$ is zero and the variance is
given by

$$\mathrm{Var}[a^T z] = \sigma^2 \|a\|^2. \tag{20}$$

The probability that the Gaussian random variable $a^T z$ takes a value larger than or equal to $b$ can be expressed as

$$P_e(X, \tilde{X}) = \mathrm{Prob}[a^T z \ge b] = Q\left(\frac{b}{\sigma \|a\|}\right). \tag{21}$$

Combining the union bound and this pairwise error probability, we immediately obtain the claim of this lemma. ■
The upper bound on decoding error probability in Lemma 1 naturally leads to a pseudo distance measure on $\mathbb{R}^{n \times n}$.

Definition 6 (Pseudo distance): The function

$$D_s(X, \tilde{X}) = \frac{\|Xs\|^2 - (\tilde{X}s)^T (Xs)}{\|Xs - \tilde{X}s\|} \tag{22}$$

is called the pseudo distance, where $X, \tilde{X} \in \mathbb{R}^{n \times n}$ are doubly stochastic matrices. □

Note that $D_s(\cdot, \cdot)$ is not a distance function since it does not satisfy the axioms of distance. In terms of decoding error
probability, the geometry of the vertices of a code polytope should be established based on this pseudo distance.
For example, in the high SNR regime, the minimum pseudo distance

$$\Delta_s = \min_{X \in \Pi(A, b, \unlhd),\ \tilde{X} \in V,\ X \ne \tilde{X}} D_s(X, \tilde{X}) \tag{23}$$

is expected to be highly influential to the decoding error probability.

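Example 3: Suppose the linear constraint $\mathrm{trace}(X) = 1$ where $n = 3$. In this case, the code polytope has the following
5 vertices:

$$\begin{aligned}
M^{(1)} &= \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, &
M^{(2)} &= \begin{pmatrix} 1/3 & 2/3 & 0 \\ 0 & 1/3 & 2/3 \\ 2/3 & 0 & 1/3 \end{pmatrix}, &
M^{(3)} &= \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \\
M^{(4)} &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, &
M^{(5)} &= \begin{pmatrix} 1/3 & 0 & 2/3 \\ 2/3 & 1/3 & 0 \\ 0 & 2/3 & 1/3 \end{pmatrix}.
\end{aligned} \tag{24}$$

In this case, the set of vertices consists of 3 integral vertices and 2 fractional vertices. Let $s = (0, 1, 2)^T$. The pseudo distance
distribution from $M^{(1)}$ is given by

$$\begin{aligned}
D_s(M^{(1)}, M^{(2)}) &= 1.388730, \\
D_s(M^{(1)}, M^{(3)}) &= 1.224745, \\
D_s(M^{(1)}, M^{(4)}) &= 1.224745, \\
D_s(M^{(1)}, M^{(5)}) &= 1.224745.
\end{aligned}$$

□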
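The pseudo distances of Example 3 can be recomputed directly from the expression $(\|Xs\|^2 - (\tilde{X}s)^T(Xs)) / \|Xs - \tilde{X}s\|$ that appears inside the Q-function of the pairwise error probability. The sketch below assumes the five vertices are the three transposition matrices and the two matrices $\tfrac{1}{3}I + \tfrac{2}{3}C^{\pm 1}$ ($C$ a cyclic shift); the labels `M1` to `M5` and the function names are illustrative.

```python
from fractions import Fraction as F
from math import sqrt

def apply(X, s):
    """Matrix-vector product Xs."""
    return [sum(xij * sj for xij, sj in zip(row, s)) for row in X]

def pseudo_distance(X, Xt, s):
    """(||Xs||^2 - (Xt s)^T (X s)) / ||Xs - Xt s||, X transmitted, Xt competing vertex."""
    xs, xts = apply(X, s), apply(Xt, s)
    num = sum(a * a for a in xs) - sum(a * b for a, b in zip(xts, xs))
    den = sqrt(sum((a - b) ** 2 for a, b in zip(xs, xts)))
    return float(num) / den

t, u = F(1, 3), F(2, 3)
M1 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]   # transposition, trace 1
M2 = [[t, u, 0], [0, t, u], [u, 0, t]]   # fractional vertex, trace 1
M3 = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
M4 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
M5 = [[t, 0, u], [u, t, 0], [0, u, t]]   # fractional vertex, trace 1
s = [0, 1, 2]

for name, M in (("M2", M2), ("M3", M3), ("M4", M4), ("M5", M5)):
    print(name, round(pseudo_distance(M1, M, s), 6))

# A fractional vertex does not preserve the Euclidean norm of s.
print(sqrt(sum(v * v for v in apply(M2, s))), sqrt(5))
```

For two permutation matrices the expression reduces to half the Euclidean distance between the codewords, which is why the integral vertices share one value here.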



B. Upper bound on ML decoding error probability 

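Assume the same setting as in the previous subsection. In the case of ML decoding, we can neglect the effect of fractional
vertices. Therefore, we obtain an upper bound on the ML block error probability

$$\begin{aligned}
P_{ML}(X) &\le \sum_{\tilde{X} \in \Pi(A, b, \unlhd) \setminus \{X\}} Q\left( \frac{\|Xs\|^2 - (\tilde{X}s)^T Xs}{\sigma \|Xs - \tilde{X}s\|} \right) \\
&= \sum_{\tilde{X} \in \Pi(A, b, \unlhd) \setminus \{X\}} Q\left( \frac{\|Xs - \tilde{X}s\|}{2\sigma} \right)
\end{aligned} \tag{25}$$

based on a similar argument. The above equality holds since $\|Xs\| = \|\tilde{X}s\|$ holds for any $\tilde{X} \in \Pi(A, b, \unlhd)$, which gives $\|Xs - \tilde{X}s\|^2 = 2(\|Xs\|^2 - (\tilde{X}s)^T Xs)$. Note that this
simplification cannot apply to $\tilde{X}$ if $\tilde{X}$ is a fractional vertex. This is because the preservation of the Euclidean norm does not hold
in general for a doubly stochastic matrix. For example, we have

$$\left\| \begin{pmatrix} 1/3 & 2/3 & 0 \\ 0 & 1/3 & 2/3 \\ 2/3 & 0 & 1/3 \end{pmatrix} s \right\| = 1.9149 \ne \|s\| = \sqrt{5}, \tag{26}$$

where $s = (0, 1, 2)^T$.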
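The ML union bound is straightforward to evaluate once the codewords are listed. The following sketch computes the sum of $Q(\|Xs - \tilde{X}s\| / (2\sigma))$ over all competing codewords for the length-4 derangement code with $s = (0,1,2,3)^T$; the Q-function is expressed through the complementary error function, $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$. The function names and the chosen values of $\sigma$ are assumptions for illustration only.

```python
from itertools import permutations
from math import erfc, sqrt

def Q(x):
    """Gaussian tail probability, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * erfc(x / sqrt(2.0))

def ml_union_bound(codewords, sent, sigma):
    """Sum of Q(||sent - c|| / (2 sigma)) over all codewords c != sent."""
    total = 0.0
    for c in codewords:
        if c != sent:
            d = sqrt(sum((a - b) ** 2 for a, b in zip(sent, c)))
            total += Q(d / (2.0 * sigma))
    return total

# Derangement code of length 4; with s = (0, 1, 2, 3) the codeword equals the permutation.
code = [p for p in permutations(range(4)) if all(p[i] != i for i in range(4))]
for sigma in (1.0, 0.5, 0.25):
    print(sigma, ml_union_bound(code, code[0], sigma))
```

Being a union bound, the value can exceed 1 at low SNR; it becomes informative as $\sigma$ decreases.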
If $\Pi(A, b, \unlhd)$ has a group structure under matrix multiplication, the above upper bound can be further simplified as

$$P_{ML} \le \sum_{X \in \Pi(A, b, \unlhd) \setminus \{I\}} Q\left( \frac{\|s - Xs\|}{2\sigma} \right). \tag{27}$$

It should be remarked that the second upper bound (27) is independent of the transmitted codeword. In order to prove the bound (27), it is sufficient to prove that $\Pi(A, b, \unlhd)$ is distance invariant with respect to the Euclidean distance.

In the following, the distance invariant property of $\Pi(A, b, \unlhd)$ will be shown. Let us define the Euclidean distance enumerator by

$$W_X(Z) \triangleq \sum_{\tilde{X} \in \Pi(A, b, \unlhd)} Z^{\|Xs - \tilde{X}s\|}. \tag{28}$$

This enumerator has the information on distance distributions measured from the permutation matrix $X$.

The next lemma states that the Euclidean distance enumerator does not depend on the center point X if the linearly 
constrained permutation matrices have a group structure. This property can be regarded as a distance invariance property of 
permutation codes. 

Lemma 2 (Distance invariance): If $\Pi(A, b, \unlhd)$ forms a group under matrix multiplication over $\mathbb{R}$, the equality

$$W_X(Z) = W(Z) \tag{29}$$

holds for any $X \in \Pi(A, b, \unlhd)$. The weight enumerator $W(Z)$ is defined by $W(Z) = W_I(Z)$, where $I$ is the $n \times n$ identity
matrix.

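Proof: Since $\Pi(A, b, \unlhd)$ forms a group, the inverse $X^{-1}$ belongs to $\Pi(A, b, \unlhd)$ as well. Since the inverse $X^{-1}$ induces
a symbol-wise permutation, it is evident that

$$\|Xs - \tilde{X}s\| = \|X^{-1}Xs - X^{-1}\tilde{X}s\| = \|s - X^{-1}\tilde{X}s\| \tag{30}$$

holds for any $X, \tilde{X} \in \Pi(A, b, \unlhd)$ ($X \ne \tilde{X}$). The Euclidean distance enumerator can be rewritten as

$$\begin{aligned}
W_X(Z) &= \sum_{\tilde{X} \in \Pi(A, b, \unlhd)} Z^{\|Xs - \tilde{X}s\|} \\
&= \sum_{\tilde{X} \in \Pi(A, b, \unlhd)} Z^{\|s - X^{-1}\tilde{X}s\|} \\
&= \sum_{X' \in \Pi(A, b, \unlhd)} Z^{\|s - X's\|} = W(Z).
\end{aligned} \tag{31}$$

The second equality is a consequence of Eq. (30). The last equality is due to the assumption that $\Pi(A, b, \unlhd)$ forms a group.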
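The distance invariance of Lemma 2 can be observed numerically by comparing the multiset of distances $\|Xs - \tilde{X}s\|$ seen from each center $X$. The sketch below is a small experiment under assumed examples: the cyclic shifts of length 4 (a group) and the derangements of length 4 (not a group, since the identity is missing). Permutations are stored as tuples `p` with $(Xs)_i = s_{p(i)}$; all names are illustrative.

```python
from itertools import permutations
from math import sqrt

def compose(p, q):
    """Composition of permutations: (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(p)))

def is_group(perms):
    """A finite non-empty set of permutations closed under composition is a group."""
    ps = set(perms)
    return all(compose(p, q) in ps for p in ps for q in ps)

def distance_profile(perms, center, s):
    """Sorted list of ||Xs - X~s|| over all X~ in perms, i.e. the exponents of W_X(Z)."""
    xs = [s[i] for i in center]
    return sorted(
        round(sqrt(sum((a - s[i]) ** 2 for a, i in zip(xs, p))), 9) for p in perms
    )

s = (0, 1, 2, 3)
shifts = [tuple((i + k) % 4 for i in range(4)) for k in range(4)]
derangements = [p for p in permutations(range(4)) if all(p[i] != i for i in range(4))]

for name, perms in (("cyclic shifts", shifts), ("derangements", derangements)):
    profiles = {tuple(distance_profile(perms, c, s)) for c in perms}
    print(name, "group:", is_group(perms), "distinct profiles:", len(profiles))
```

A single distinct profile means the enumerator does not depend on the center; the derangement set illustrates that this can fail without the group assumption.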



Example 4: We have performed a computer experiment for the following two codes:

1) LP decodable permutation code corresponding to the derangements of length 5. The additional linear constraint is
$\mathrm{trace}(X) = 0$. A transmitted word $(1, 0, 4, 2, 3)^T$ is assumed. The code polytope has 44 vertices, which are all integral
vertices.

2) LP decodable permutation code of length 5 corresponding to an additional linear constraint $X_{1,1} + X_{5,5} = 1$. A transmitted
word $(0, 4, 3, 2, 1)^T$ is assumed. The code polytope has 330 vertices. The set of vertices contains 36 integral vertices and
294 fractional vertices.

An AWGN channel with noise variance $\sigma^2$ is assumed. The signal-to-noise ratio is defined by $\mathrm{SNR} = 10 \log_{10}(1/\sigma^2)$. The
LP decoding described in the previous section was employed for decoding.
Figure 2 presents the upper bounds and simulation results on block error probability of these permutation codes. It is readily
observed that the upper bounds presented in this section show reasonable agreement with the simulation results.

Both codes have the same minimum pseudo distance 0.707107 and similar cardinalities (44 and 36), but the derangement
code provides much better block error probabilities than those of the code with the constraint $X_{1,1} + X_{5,5} = 1$. This is because
the existence of fractional vertices (i.e., 294 fractional vertices) severely degrades the decoding performance of the code with
the constraint $X_{1,1} + X_{5,5} = 1$ compared with the derangement code.

□
Fig. 2. Comparison of upper bounds and simulation results for LP decoding on block error probabilities ($n = 5$)



V. Some classes of linearly constrained permutation codes 
In this section, we will discuss some classes of linearly constrained permutation codes which are easy to encode.

A. Repetition permutation codes 

Let $\eta$ be a positive integer. Assume that a positive integer $n$ is a multiple of $\eta$. The repetition permutation code with
repetition order $\eta$ is defined by

$$\{((Ys_1)^T, (Ys_2)^T, \ldots, (Ys_\eta)^T)^T \in \mathbb{R}^n \mid Y \in \Pi_{n/\eta}\}, \tag{32}$$

where $s_1, s_2, \ldots, s_\eta \in \mathbb{R}^{n/\eta}$. We here assume that all the elements in $s_1, \ldots, s_\eta$ are distinct from each other. It is evident that the
cardinality of the code is given by $(n/\eta)!$. The minimum Hamming distance of the code is $2\eta$ because the minimum Hamming
distance of $Ys_i$ is 2 for any $i \in [1, \eta]$.

It should be remarked that the repetition permutation code is a linearly constrained permutation code. The next example
demonstrates linear constraints for the repetition permutation codes.

Example 5: Let

$$X = \begin{pmatrix}
X_{1,1} & X_{1,2} & X_{1,3} & X_{1,4} \\
X_{2,1} & X_{2,2} & X_{2,3} & X_{2,4} \\
X_{3,1} & X_{3,2} & X_{3,3} & X_{3,4} \\
X_{4,1} & X_{4,2} & X_{4,3} & X_{4,4}
\end{pmatrix}.$$

The permutation matrices in $\Pi_4$ satisfying the following set of linear constraints

$$X_{1,3} = X_{1,4} = X_{2,3} = X_{2,4} = 0 \tag{33}$$

$$X_{3,1} = X_{3,2} = X_{4,1} = X_{4,2} = 0 \tag{34}$$

$$X_{1,1} = X_{3,3}, \quad X_{1,2} = X_{3,4} \tag{35}$$

$$X_{2,1} = X_{4,3}, \quad X_{2,2} = X_{4,4} \tag{36}$$

define the repetition permutation code of length 4 with repetition order 2. □
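As a quick check of Example 5, one can filter the 24 permutation matrices of order 4 by the block constraints: the two off-diagonal $2 \times 2$ blocks vanish and the two diagonal blocks coincide. The sketch below uses 0-based indices and the assumed initial vector $s = (0, 1, 2, 3)^T$, i.e., $s_1 = (0,1)^T$ and $s_2 = (2,3)^T$; the function names are illustrative.

```python
from itertools import permutations

def perm_matrix(p):
    """Permutation matrix X with X[i][p[i]] = 1."""
    n = len(p)
    return [[1 if j == p[i] else 0 for j in range(n)] for i in range(n)]

def satisfies_repetition_constraints(X):
    """Block constraints of Example 5 for n = 4, repetition order 2 (0-based indices)."""
    # Off-diagonal 2x2 blocks are zero.
    zero_blocks = all(X[i][j] == 0 and X[j][i] == 0 for i in (0, 1) for j in (2, 3))
    # Upper-left block equals lower-right block.
    equal_blocks = all(X[i][j] == X[i + 2][j + 2] for i in (0, 1) for j in (0, 1))
    return zero_blocks and equal_blocks

s = (0, 1, 2, 3)
codewords = []
for p in permutations(range(4)):
    X = perm_matrix(p)
    if satisfies_repetition_constraints(X):
        codewords.append(tuple(sum(X[i][j] * s[j] for j in range(4)) for i in range(4)))
print(codewords)
```

The two surviving codewords differ in all four positions, in line with the cardinality $(n/\eta)! = 2$ and the minimum Hamming distance $2\eta = 4$ stated above.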
B. Cartesian product codes 

Suppose that $\eta$ is a positive integer and that $n$ is a positive multiple of $\eta$. A set of permutation matrices $U \subset \Pi_{n/\eta}$ is assumed
to be given. The Cartesian product code is defined by

$$\{((Y_1 s_1)^T, (Y_2 s_2)^T, \ldots, (Y_\eta s_\eta)^T)^T \in \mathbb{R}^n \mid Y_1, \ldots, Y_\eta \in U\}, \tag{37}$$

where $s_1, s_2, \ldots, s_\eta \in \mathbb{R}^{n/\eta}$. The cardinality of the Cartesian product code is thus given by $|U|^\eta$ if all the elements in $s_1, \ldots, s_\eta$
are distinct from each other. Note that the class of Cartesian product codes can be defined based on a set of linear constraints as
well if $U$ is defined by linear constraints.



C. Pure involution codes 

In this subsection, we focus on the set of pure involutions, which produces a non-trivial class of permutation codes. It will
be shown that the class of permutation codes defined based on the pure involutions possesses several good properties. This
class of codes can be encoded with an efficient greedy encoding algorithm. The cardinality of the code is much larger than that of the
repetition code with the same length and the same minimum Hamming distance.

An involution is a permutation which coincides with its inverse permutation. Namely, the necessary and sufficient condition
for a permutation matrix $X \in \Pi_n$ to be an involution is $X = X^T$ because the inverse matrix of a permutation matrix is its
transpose. A pure involution is an involution without fixed points; i.e., a permutation matrix $X \in \Pi_n$ is said to be a
pure involution if and only if $X = X^T$ and $\mathrm{trace}(X) = 0$. In other words, the set of pure involutions is the intersection of
the set of involutions and the set of derangements.

A pure involution exists only when $n$ is a positive even number. The reason is as follows. The lower triangle below the diagonal
of $X$ and the upper triangle above the diagonal must have the same number of ones since $X = X^T$. This implies that the
number of ones in $X$ should be even since the diagonal is constrained to be zero. A permutation matrix $X \in \Pi_n$ contains
$n$ ones. Thus, if $n$ is odd, it is clear that no permutation matrix meets the constraints. Throughout this subsection, we assume
that $n$ is an even positive number.

Let

$$\Theta_n = \{X \in \Pi_n \mid X = X^T,\ \mathrm{trace}(X) = 0\}.$$

It is known that the cardinality of the pure involutions is given by

$$|\Theta_n| = (n-1)(n-3) \times \cdots \times 3 \times 1 = \frac{n!}{2^{n/2} (n/2)!}. \tag{38}$$
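The cardinality of the set of pure involutions can be confirmed by brute force for small even $n$. The sketch below counts permutations $p$ with $p(p(i)) = i$ and $p(i) \ne i$ for all $i$, and compares the count with the product $(n-1)(n-3)\cdots 3 \cdot 1$ and with the standard closed form $n! / (2^{n/2}(n/2)!)$ of that product. The function names are illustrative assumptions.

```python
from itertools import permutations
from math import factorial

def is_pure_involution(p):
    """p(p(i)) = i (X = X^T) and p(i) != i (trace(X) = 0) for every i."""
    return all(p[p[i]] == i and p[i] != i for i in range(len(p)))

def odd_double_factorial(n):
    """(n - 1)(n - 3) ... 3 * 1 for even n."""
    out = 1
    for k in range(n - 1, 0, -2):
        out *= k
    return out

for n in (2, 4, 6, 8):
    count = sum(is_pure_involution(p) for p in permutations(range(n)))
    closed = factorial(n) // (2 ** (n // 2) * factorial(n // 2))
    print(n, count, odd_double_factorial(n), closed)
```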
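The linearly constrained permutation codes defined based on the constraints $X = X^T$, $\mathrm{trace}(X) = 0$ are called the pure
involution codes. The triple for the pure involution codes is given by

$$A = \begin{pmatrix} \mathrm{vec}(I_n)^T \\ \mathrm{vec}(F^{(1,2)})^T \\ \mathrm{vec}(F^{(1,3)})^T \\ \vdots \\ \mathrm{vec}(F^{(n-1,n)})^T \end{pmatrix}, \quad b = 0, \quad \unlhd = (=, \ldots, =)^T, \tag{39}$$

where $F^{(i,j)} \in \{-1, 0, 1\}^{n \times n}$ is the matrix defined by

$$F^{(i,j)}_{a,b} = \begin{cases} 1, & (a, b) = (i, j), \\ -1, & (a, b) = (j, i), \\ 0, & \text{otherwise}. \end{cases}$$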

1) Greedy encoding algorithm for pure involutions: A significant advantage of the pure involutions is that there exists an
efficient encoding algorithm. The procedure EncMap shown below can be considered as a greedy algorithm for a constraint
satisfaction problem without a back-tracking process.

EncMap

* Input: $m \in [1, (n-1) \times (n-3) \cdots 3 \times 1]$ (message)

* Output: $X \in \Theta_n$ (pure involution)

1) $m := m - 1$;

2) for ($p := 0$; $p < n/2$; $p := p + 1$) {

3) $a_p := [m \bmod (2p + 1)] + 1$;

4) $m := m \ \mathrm{div}\ (2p + 1)$;

5) }

6) $\forall i, j \in [1, n]$, $X_{i,j} := 0$;

7) $\forall i, j \in [1, n]\ (i \ne j)$, $Z_{i,j} := 1$; $\forall i \in [1, n]$, $Z_{i,i} := 0$;

8) for ($p := n/2 - 1$; $p \ge 0$; $p := p - 1$) {

9) $j := \arg\min \{ j' \in [1, n] : \sum_{i'=1}^{n} Z_{i',j'} > 0 \}$;

10) $i := \arg\min \{ k \in [1, n] : \sum_{i'=1}^{k} Z_{i',j} = a_p \}$;

11) $X_{i,j} := 1$; $X_{j,i} := 1$;

12) $\forall q \in [1, n]$, $Z_{q,j} := 0$, $Z_{j,q} := 0$, $Z_{i,q} := 0$, $Z_{q,i} := 0$;

13) }

14) Output $X$;

The arithmetic operation in line 4 represents the division for integers; i.e., 5 div 2 = 2. There are some remarks on
EncMap. The part from line 1 to line 5 converts a message integer into an $n/2$-tuple:

$$(a_0, a_1, \ldots, a_{n/2-1}) \in [1, 1] \times [1, 3] \times \cdots \times [1, n-1].$$

The remaining part generates a pure involution according to the $n/2$-tuple $(a_0, a_1, \ldots, a_{n/2-1})$.

The variable $Z_{i,j}$ represents whether $X_{i,j}$ is determined ($Z_{i,j} = 0$) or not ($Z_{i,j} = 1$). The diagonal elements of $Z$ are
initialized to zero, which means that the diagonal elements of $X$ are determined to be zero.
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The generation of a pure involution is performed in a greedy manner from the left columns to the right columns. The 
undetermined colimm with the smallest index is found in the line 9. In the line 10, the row index of Op-th undetermined 
element is assigned to i. In the hne 11, two ones are written at and {j, i)-positions of X and the Une 12 fixes the cross 
regions aroimd and 

In an encoding process, for any $p = N - t$ ($t \in [1, N]$, with $N = n/2$),

\[ \sum_{i' \in [1,n]} Z_{i',j} = (n - 1) - 2(t - 1) = 2p + 1 \tag{40} \]

holds at line 10. This is because exactly two columns and two rows of $Z$ are set to zero in each iteration due to the constraints of the pure involution. In other words, the numbers of zero columns and zero rows each increase by two after an iteration. This property guarantees that

\[ \sum_{i' \in [1,n]} Z_{i',j} \ge a_p \tag{41} \]

holds for all $p \in [0, n/2 - 1]$. Therefore, for any input $m$, line 10 can find an index $i$ satisfying

\[ i = \arg\min \left\{ k \in [1, n] : \sum_{i'=1}^{k} Z_{i',j} = a_p \right\}. \]

The loop from line 2 to line 5 takes $O(n)$ time under the assumption that basic big-number arithmetic can be done in unit time. The initialization (lines 6 and 7) requires $O(n^2)$ time. The most time-consuming part of EncMap is the loop from line 8 to line 13. Finding $i$ and $j$ in lines 9 and 10 requires $O(n)$ time, and line 12 also needs $O(n)$ time. Since the loop body is executed $n/2$ times, the time complexity of the loop (from line 8 to line 13) is $O(n^2)$, which dominates the time complexity of EncMap.

From the definition shown above, it is evident that EncMap gives an injection from $[1, (n-1) \times (n-3) \cdots 3 \times 1]$ to $\Omega_n$. Since the cardinality of $\Omega_n$ is $(n-1) \times (n-3) \cdots 3 \times 1$, we can see that EncMap is a bijection.
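The greedy procedure can be sketched in Python as follows. This is a minimal transcription of the EncMap pseudocode under two assumptions: indices are 0-based internally, and the loop in line 8 is read as running down to $p = 0$ so that all $n/2$ pairs are placed. The helper `is_pure_involution` is a hypothetical checker added for illustration.

```python
from math import prod

def enc_map(m, n):
    """Greedy EncMap sketch: message m in [1, (n-1)(n-3)...1] -> matrix X."""
    half = n // 2
    assert n % 2 == 0 and 1 <= m <= prod(range(1, n, 2))
    m -= 1                                    # line 1
    a = []
    for p in range(half):                     # lines 2-5: mixed-radix digits
        a.append(m % (2 * p + 1) + 1)
        m //= 2 * p + 1
    X = [[0] * n for _ in range(n)]           # line 6
    Z = [[int(i != j) for j in range(n)] for i in range(n)]   # line 7
    for p in range(half - 1, -1, -1):         # line 8
        # line 9: leftmost column that still has an undetermined cell
        j = next(c for c in range(n) if any(Z[r][c] for r in range(n)))
        # line 10: row of the a_p-th undetermined cell in column j
        seen = 0
        for i in range(n):
            seen += Z[i][j]
            if seen == a[p]:
                break
        X[i][j] = X[j][i] = 1                 # line 11
        for q in range(n):                    # line 12: close the cross regions
            Z[q][j] = Z[j][q] = Z[i][q] = Z[q][i] = 0
    return X

def is_pure_involution(X):
    """Check: permutation matrix, symmetric, zero diagonal."""
    n = len(X)
    return (all(sum(row) == 1 for row in X)
            and all(sum(X[i][j] for i in range(n)) == 1 for j in range(n))
            and all(X[i][j] == X[j][i] for i in range(n) for j in range(n))
            and all(X[i][i] == 0 for i in range(n)))
```

For $n = 6$, calling `enc_map(m, 6)` for $m = 1, \ldots, 15$ yields 15 distinct matrices that all pass `is_pure_involution`, in line with the bijection claim.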

Because EncMap is a bijection, there is an inverse map of EncMap from $\Omega_n$ to $[1, (n-1) \times (n-3) \cdots 3 \times 1]$. The procedure DecMap gives the inverse map of EncMap.
DecMap

* Input: $X \in \Omega_n$ (pure involution)

* Output: $m \in [1, (n-1) \times (n-3) \cdots 3 \times 1]$ (message)

1) $\forall i, j \in [1, n]\ (i \ne j)$, $Z_{i,j} := 1$; $\forall i \in [1, n]$, $Z_{i,i} := 0$;

2) for ($p := n/2 - 1$; $p \ge 0$; $p := p - 1$) {

3) $j := \arg\min \{ j' \in [1, n] : \sum_{i'=1}^{n} Z_{i',j'} > 0 \}$;

4) $i := \sum_{i' \in [1,n]} i' \, I[X_{i',j} = 1]$;

5) $a_p := \sum_{i'=1}^{i} Z_{i',j}$;

6) $\forall q \in [1, n]$, $Z_{q,j} := 0$, $Z_{j,q} := 0$, $Z_{i,q} := 0$, $Z_{q,i} := 0$;

7) }

8) $m := 0$;

9) for ($p := n/2 - 1$; $p \ge 1$; $p := p - 1$) {

10) $m := (2p + 1)m + (a_p - 1)$;

11) }

12) $m := m + 1$;

13) Output $m$;
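A corresponding Python sketch of DecMap is given below. It assumes 0-based indices and reads the loop bounds as $p \ge 0$ in line 2 and $p \ge 1$ in line 9 (the step for $p = 0$ would multiply by one and add zero). The helper `pure_involutions` is a hypothetical brute-force generator used only to exercise the decoder.

```python
from itertools import permutations

def dec_map(X):
    """DecMap sketch: pure involution matrix X -> message m."""
    n = len(X)
    half = n // 2
    Z = [[int(i != j) for j in range(n)] for i in range(n)]    # line 1
    a = [0] * half
    for p in range(half - 1, -1, -1):                          # line 2
        # line 3: leftmost column that still has an undetermined cell
        j = next(c for c in range(n) if any(Z[r][c] for r in range(n)))
        i = next(r for r in range(n) if X[r][j] == 1)          # line 4
        a[p] = sum(Z[r][j] for r in range(i + 1))              # line 5
        for q in range(n):                                     # line 6
            Z[q][j] = Z[j][q] = Z[i][q] = Z[q][i] = 0
    m = 0                                                      # line 8
    for p in range(half - 1, 0, -1):                           # lines 9-11
        m = (2 * p + 1) * m + (a[p] - 1)
    return m + 1                                               # line 12

def pure_involutions(n):
    """Brute-force list of all n x n pure involution matrices."""
    out = []
    for perm in permutations(range(n)):
        if all(perm[perm[i]] == i and perm[i] != i for i in range(n)):
            out.append([[int(perm[i] == j) for j in range(n)]
                        for i in range(n)])
    return out

messages = sorted(dec_map(X) for X in pure_involutions(6))
print(messages == list(range(1, 16)))
```

The final check confirms that, for $n = 6$, the decoder maps the 15 pure involutions onto the 15 messages without collisions.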

Example 6: An encoding process of a pure involution matrix is illustrated in Fig. 3. In this example, $n = 6$ is assumed. The status of $X_{i,j}$ and $Z_{i,j}$ is depicted by $6 \times 6$ cells in Fig. 3. Namely, $Z_{i,j} = 1$ (undetermined state) is represented by an empty cell. A cell with label 0 (resp. 1) represents $(X_{i,j}, Z_{i,j}) = (0, 0)$ (resp. $(X_{i,j}, Z_{i,j}) = (1, 0)$). At first, the diagonal cells are set to zero because of the constraint $\mathrm{trace}(X) = 0$. The message is assumed to be $m = 5$. In this case, we have $a_0 = 1, a_1 = 3, a_2 = 2$. The shaded cells in Fig. 3(a) represent possible places to write the symbol 1. According to the part of the message $a_2 = 2$, the second shaded cell is set to 1. In Fig. 3(b), the symbol 1 is written at the symmetric position and zeros are placed in the columns and rows corresponding to the two 1's. In a similar way (Fig. 3(b)-(e)), the empty cells are filled with 0 or 1. As a result, we have a pure involution matrix (Fig. 3(f)).

The shaded cells represent possible places to write the symbol 1. In (a) and (b), there are 5 and 3 shaded cells, respectively. This means that $5 \times 3 = 15$ pure involution matrices exist when $n = 6$.

Fig. 3. An encoding process of a pure involution matrix

2) Minimum Hamming distance of pure involution codes: Let $s \in \mathbb{R}^n$ be an initial vector whose components are all distinct. It is well known that the minimum Hamming distance of $\Lambda(s)$ is given by

\[ \min_{X, X' \in \Pi_n\,(X \ne X')} d_H(Xs, X's) = 2. \tag{42} \]

The minimum Hamming distance of the pure involution codes is larger than that of $\Lambda(s)$.

Lemma 3 (Minimum distance): The minimum Hamming distance of the pure involution codes is given by

\[ \min_{X, X' \in \Omega_n\,(X \ne X')} d_H(Xs, X's) = 4. \tag{43} \]
Proof: Assume that $X, X' \in \Omega_n$ ($X \ne X'$). Since $X \ne X'$, there is an index pair $(i, j) \in [1, n]^2$ satisfying $X_{i,j} \ne X'_{i,j}$. Without loss of generality, we assume that $X_{i,j} = 1$ and $X'_{i,j} = 0$. An index $l \in [1, n]$ satisfying $X_{i,l} \ne X'_{i,l}$ must exist because $X$ and $X'$ are permutation matrices. Due to the assumption $X_{i,j} = 1$ and $X'_{i,j} = 0$, we have $X_{i,l} = 0$ and $X'_{i,l} = 1$. In a similar manner, there must be an index $k$ satisfying $X_{k,j} = 0$, $X'_{k,j} = 1$. It is possible to continue this argument until a sequence of index pairs constitutes a loop.

The set of the index pairs $\{(i, j) \in [1, n]^2 \mid X_{i,j} \ne X'_{i,j}\}$ is called a difference position set. The argument above implies that the difference position set needs to be partitioned into several loops of even length. A loop means a sequence of adjacent index pairs of the form $(i_1, i_2) \to (i_1, i_3) \to (i_4, i_3) \to \cdots \to (i_1, i_2)$. If $X_{i_1,i_2} = 1$ holds, then we have $X_{i_1,i_3} = 0$, $X_{i_4,i_3} = 1$, and so on. Therefore, the length of a loop must be even, because a loop of odd length gives the inconsistent assignment $X_{i_1,i_2} = 0$ at the end of the loop.

The shortest loop of even length has the form $(i, j) \to (i, l) \to (k, l) \to (k, j) \to (i, j)$. If the difference position set includes this type of loop of length 4, it must also contain another loop of length 4 of the form $(j, i) \to (l, i) \to (l, k) \to (j, k) \to (j, i)$, because $X = X^T$ holds for any $X \in \Omega_n$ (see Fig. 4). Let $a = Xs$ and $a' = X's$. If the difference position set consists of only such two symmetric loops of length 4, we have

\[ a_u \ne a'_u \iff u \in \{i, j, k, l\}. \]

This implies that the smallest number of differences between $Xs$ and $X's$ is 4. ■
The proof of the above lemma indicates a way to enumerate the number of codewords at the minimum Hamming distance. 
For a fixed Xs, the number of codewords X's satisfying dH{Xs,X' s) = 4 can be obtained by enumerating the number of 
allocations of two symmetric loops. 
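Lemma 3 and the enumeration idea can be checked numerically for small $n$. The sketch below is illustrative only: it assumes the initial vector $s = (1, 2, \ldots, n)$, represents each pure involution as a permutation tuple, and counts the codewords at minimum distance from one fixed codeword.

```python
from itertools import combinations, permutations

def pure_involution_perms(n):
    """All fixed-point-free involutions of range(n), as tuples."""
    return [p for p in permutations(range(n))
            if all(p[p[i]] == i and p[i] != i for i in range(n))]

def hamming(u, v):
    return sum(x != y for x, y in zip(u, v))

def codeword(p, s):
    """(Xs)_i = s[p[i]] for the permutation matrix X with X[i][p[i]] = 1."""
    return tuple(s[j] for j in p)

def min_distance_and_neighbours(n):
    s = list(range(1, n + 1))                 # distinct components (assumed)
    code = [codeword(p, s) for p in pure_involution_perms(n)]
    dmin = min(hamming(u, v) for u, v in combinations(code, 2))
    nearest = sum(hamming(code[0], v) == dmin for v in code[1:])
    return dmin, nearest

print(min_distance_and_neighbours(4))
print(min_distance_and_neighbours(6))
```

In both cases the minimum distance is 4; for $n = 6$ the fixed codeword has 6 neighbours at that distance, which matches choosing two of the three transposition pairs and re-pairing their four elements in one of the two other ways.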



The left loop of length 4 represents $(i, j) \to (i, l) \to (k, l) \to (k, j) \to (i, j)$ and the right loop corresponds to $(j, i) \to (l, i) \to (l, k) \to (j, k) \to (j, i)$. Note that there are 4 columns which include elements of the difference position set. These columns correspond to the positions at which the symbols of $Xs$ and $X's$ differ.

Fig. 4. Two symmetric loops of length 4 in a difference position set.



We have seen that the repetition code of repetition order 2 yields the minimum Hamming distance 4. When the length of the code is $n$ (even), the number of codewords is given by $(n/2)!$. On the other hand, the pure involution code provides the same minimum Hamming distance and the cardinality of the code is given by $n!/(2^{n/2}(n/2)!)$, which is much larger than $(n/2)!$ because

\[ \frac{n!/(2^{n/2}(n/2)!)}{(n/2)!} = \frac{1}{2^{n/2}} \binom{n}{n/2} \simeq \frac{2^{n/2}}{\sqrt{\pi n/2}}. \tag{44} \]

For example, consider the case where $n = 64$. In this case, the number of codewords of the repetition code is $(n/2)! \simeq 2^{118}$. On the other hand, the pure involution code has

\[ \frac{n!}{2^{n/2}(n/2)!} \simeq 2^{146} \]

codewords, which is approximately $2^{28}$ times larger than that of the repetition code.
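The orders of magnitude quoted for $n = 64$ can be reproduced with exact integer arithmetic. The following Python sketch is illustrative; the function names are not from the paper.

```python
from math import factorial, log2

def repetition_size(n):
    """Number of codewords of the repetition permutation code of order 2."""
    return factorial(n // 2)

def pure_involution_size(n):
    """Number of pure involutions of even order n."""
    return factorial(n) // (2 ** (n // 2) * factorial(n // 2))

n = 64
rep, inv = repetition_size(n), pure_involution_size(n)
print(f"log2 of repetition code size     : {log2(rep):.2f}")
print(f"log2 of pure involution code size: {log2(inv):.2f}")
print(f"log2 of the ratio                : {log2(inv) - log2(rep):.2f}")
```

The base-2 logarithms round to 118 and 146, and the gap between them lies between 28 and 29 bits.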

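3) Code polytope of pure involutions: The linear constraints $X = X^T$ and $\mathrm{trace}(X) = 0$ for pure involutions define a code polytope which is not an integral polytope.

Example 7: Assume that $n = 6$. The code polytope defined by the constraints $X = X^T$ and $\mathrm{trace}(X) = 0$ has 15 integral vertices and 10 fractional vertices. Every entry of a fractional vertex is 0 or 1/2, and each row and each column contains exactly two entries equal to 1/2; for example,

\[ \begin{pmatrix} 0 & 1/2 & 1/2 & 0 & 0 & 0 \\ 1/2 & 0 & 1/2 & 0 & 0 & 0 \\ 1/2 & 1/2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1/2 & 1/2 \\ 0 & 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 0 & 1/2 & 1/2 & 0 \end{pmatrix}. \]

Deriving an inequality description of the convex hull of pure involution matrices is an interesting open problem.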

4) Simulation results: The minimum Hamming distance of a permutation code is a universal measure for goodness of a 
code because it does not depend on the choice of the initial vector s. However, as we have seen in the previous section, 
decoding performance is mostly determined by the pseudo distance distribution of a code polytope. 

In order to evaluate the decoding performance of pure involution codes, we have performed a computer experiment. Figure 5 presents the block error probability of the pure involution code of length 64. In this experiment, the initial vector is assumed to be $s = (1, 2, \ldots, 64)$ and LP decoding was used. The definition of the SNR is the same as in Example 4. For comparison, the block error probabilities of the repetition permutation code of length 64 with repetition order 2 and of uncoded permutation vectors (i.e., $\Lambda(s)$) of length 64 are also plotted in Fig. 5. It can be observed that the pure involution code gives much smaller block error probabilities than the repetition code. As we have seen in the previous section, the cardinality of a pure involution code is also much larger than that of the repetition code. We may therefore conclude that the pure involution code is superior to the repetition code.



D. Block permutation codes 

Block permutation codes are defined based on block permutation matrices. The block structure is useful for encoding and for evaluating the minimum squared Euclidean distance.

1) Definitions: Suppose that the set $[1, n] \times [1, n]$ is divided into $\gamma \times \gamma$ mutually disjoint square blocks of size $\nu \times \nu$ (i.e., $n = \nu\gamma$ holds). These square blocks are called blocks, and they are explicitly defined as follows.

Definition 7 (Block): For $k, b \in [1, \gamma]$, a block $B_{k,b}$ is defined by

\[ B_{k,b} \triangleq \{(i, j) \in [1, n]^2 \mid \nu(k-1) < i \le \nu k,\ \nu(b-1) < j \le \nu b\}. \tag{45} \]

The indices $k$ and $b$ are called block indices. □

The rectangle region $T^{(l)}_{k,b}$ is defined as

\[ T^{(l)}_{k,b} \triangleq \{(i, j) \in B_{k,b} \mid j = \nu(b-1) + l\} \tag{46} \]

for $k, b \in [1, \gamma]$ and $l \in [1, \nu]$. The subscript $k, b$ specifies the block to which the rectangle region $T^{(l)}_{k,b}$ belongs. The superscript $l \in [1, \nu]$, which is called a subindex, indicates the relative position in the block $B_{k,b}$.

Fig. 5. Comparison of block error probabilities: pure involution codes, repetition permutation codes, and uncoded permutation vectors of length 64

We are now ready to define a block permutation matrix, which is the basis for realizing a block-wise permutation group.

Definition 8 (Block permutation matrix): Assume that a permutation matrix $X \in \Pi_n$ is given. If, for any $b \in [1, \gamma]$, there exists a unique block index $k$ satisfying

\[ X(B_{k,b}) \ne 0, \tag{47} \]

then $X$ is called a block permutation matrix. The notation $X(B_{k,b})$ represents the sub-matrix of $X$ corresponding to the block $B_{k,b}$. □

From this definition, it is apparent that a nonzero $X(B_{k,b}) \in \{0, 1\}^{\nu \times \nu}$ is a permutation matrix if $X$ is a block permutation matrix. Furthermore, there exists a unique block index $b$ satisfying $X(B_{k,b}) \ne 0$ for any block index $k \in [1, \gamma]$. This equivalent statement can be obtained by exchanging the roles of columns and rows in the above definition.
For block indices $k, b \in [1, \gamma]$ and subindex $l \in [1, \nu]$, the skewed column set is defined by

\[ U^{(l)}_{k,b} \triangleq T^{(l)}_{k,b} \cup \bigcup_{k' \in [1,\gamma] \setminus \{k\}} T^{((l \bmod \nu) + 1)}_{k',b}. \tag{48} \]

Figure 6 illustrates the subsets of $[1, n] \times [1, n]$ that have appeared so far, such as the blocks, the rectangle regions, and the skewed column set.

2) Block permutation codes: The next theorem presents a set of linear constraints characterizing block permutation matrices.

Theorem 3 (Characterization of block permutation matrix): Let $X \in \Pi_n$ be a permutation matrix. The permutation matrix $X$ is a block permutation matrix if and only if

\[ \sum_{(u,v) \in U^{(l)}_{k,b}} X_{u,v} = 1 \tag{49} \]

holds for any $b, k \in [1, \gamma]$, $l \in [1, \nu]$.

Fig. 6. Blocks, rectangle regions and skewed column set

The next example clarifies the linear constraints characterizing a $4 \times 4$ block permutation matrix.

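Example 8: Let $n = 4$, $\nu = 2$, $\gamma = 2$. The necessary and sufficient conditions for a permutation matrix $X \in \Pi_4$ to be a block permutation matrix are as follows:

\[ \begin{aligned} X_{1,1} + X_{2,1} + X_{3,2} + X_{4,2} &= 1, \\ X_{1,2} + X_{2,2} + X_{3,1} + X_{4,1} &= 1, \\ X_{1,3} + X_{2,3} + X_{3,4} + X_{4,4} &= 1, \\ X_{1,4} + X_{2,4} + X_{3,3} + X_{4,3} &= 1. \end{aligned} \tag{50} \]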
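The characterization in Theorem 3 can be checked by brute force for small sizes. The Python sketch below is a minimal illustration, assuming that the rectangle region $T^{(l)}_{k,b}$ is the $l$-th column of block $B_{k,b}$ and that the skewed column set is formed as in (48); all function names are hypothetical.

```python
from itertools import permutations

def rect(k, b, l, nu):
    """Column piece T^(l)_{k,b}: column l of block (k, b); 1-based indices."""
    j = nu * (b - 1) + l
    return [(i, j) for i in range(nu * (k - 1) + 1, nu * k + 1)]

def skewed(k, b, l, nu, gamma):
    """Skewed column set U^(l)_{k,b}, following (48)."""
    cells = rect(k, b, l, nu)
    for k2 in range(1, gamma + 1):
        if k2 != k:
            cells += rect(k2, b, l % nu + 1, nu)
    return cells

def satisfies_constraints(X, nu, gamma):
    """Check the linear equalities (49) for every (k, b, l)."""
    blocks, subs = range(1, gamma + 1), range(1, nu + 1)
    return all(
        sum(X[u - 1][v - 1] for u, v in skewed(k, b, l, nu, gamma)) == 1
        for k in blocks for b in blocks for l in subs)

def is_block_permutation(X, nu, gamma):
    """Definition 8: every block column contains exactly one nonzero block."""
    for b in range(gamma):
        nonzero = [k for k in range(gamma)
                   if any(X[i][j]
                          for i in range(nu * k, nu * (k + 1))
                          for j in range(nu * b, nu * (b + 1)))]
        if len(nonzero) != 1:
            return False
    return True

def perm_matrices(n):
    for p in permutations(range(n)):
        yield [[int(p[i] == j) for j in range(n)] for i in range(n)]

mats = list(perm_matrices(4))
print(sum(satisfies_constraints(X, 2, 2) for X in mats))
```

For $n = 4$, $\nu = 2$, the equalities select exactly the 8 block permutation matrices among the 24 permutation matrices, and the two predicates agree on every matrix.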



Let us denote the set of block permutation matrices by

\[ \Pi(n, \nu) \triangleq \{X \in \Pi_n \mid X \text{ satisfies (49)}\}. \tag{51} \]

Note that we here employ the lighter notation $\Pi(n, \nu)$ instead of $\Pi(A, b, \trianglelefteq)$ since it explicitly expresses the dependency on $n$ and $\nu$. It should be remarked that $\Pi(n, \nu)$ forms a group under matrix multiplication over $\mathbb{R}$.

The class of block permutation codes defined below is a class of LP decodable permutation codes. 

Definition 9 (Block permutation code): Let $n$ be a positive integer and let the positive integer $\nu$ be a divisor of $n$. The initial vector $s$ belongs to $\mathbb{R}^n$. The block permutation code $C(n, \nu, s)$ is defined by

\[ C(n, \nu, s) \triangleq \{Xs \in \mathbb{R}^n : X \in \Pi(n, \nu)\}. \tag{52} \]

□

In Section IV, we saw that the minimum pseudo distance is one of the most influential parameters for LP decoding performance. Unfortunately, the evaluation of the minimum pseudo distance is not a trivial problem. As a possible alternative, we here evaluate the minimum squared Euclidean distance of $C(n, \nu, s)$ defined by

\[ d^2_{\min} \triangleq \min_{x, y \in C(n,\nu,s)\,(x \ne y)} \|x - y\|^2. \tag{53} \]

At least, we can say that decoding performance degrades even with an ML decoder if $C(n, \nu, s)$ has small $d^2_{\min}$.

The block-wise permutation structure of a block permutation code can be exploited for deriving a simple formula for the minimum squared Euclidean distance.
Let us define $\Delta_1^2$ and $\Delta_2^2$ by

\[ \Delta_1^2 \triangleq \min_{k \in [1,\gamma]}\ \min_{Q \in \Pi_\nu\,(Q \ne I)} \|s_k - Q s_k\|^2, \]

\[ \Delta_2^2 \triangleq \min_{k, j \in [1,\gamma]\,(k \ne j)}\ \min_{Q \in \Pi_\nu} \|s_k - Q s_j\|^2, \tag{54} \]

where $s_k \in \mathbb{R}^\nu$ denotes the $k$-th block of the initial vector $s = (s_1^T, \ldots, s_\gamma^T)^T$.

Assume that both $\Delta_1^2$ and $\Delta_2^2$ are positive for given $n, \nu, s$. In such a case, $C(n, \nu, s)$ is non-singular and it is easily proved that the minimum squared Euclidean distance of $C(n, \nu, s)$ is given by

\[ d^2_{\min} = \min\{\Delta_1^2, 2\Delta_2^2\}. \tag{55} \]

The following example illustrates that a block permutation code can have more codewords than a trivial Cartesian product code under the condition that both codes have the same minimum squared Euclidean distance.
Example 9: Let $n = 8$, $\gamma = 2$, $\nu = 4$. The initial vector $s = (s_1^T, s_2^T)^T$ is assumed to be



si = 



3 

5 



S2 = 



From the definition of $\Delta_1^2, \Delta_2^2$, we easily obtain $\Delta_1^2 = 8$, $\Delta_2^2 = 4$. From (55), we have $d^2_{\min} = \min\{8, 2 \times 4\} = 8$. The number of codewords is $\gamma! \times (\nu!)^\gamma = 1152$. The Cartesian product code defined by

\[ \{(x_1^T, x_2^T)^T : x_1 \in \Lambda(s_1),\ x_2 \in \Lambda(s_2)\} \]

also has minimum squared Euclidean distance 8, but it contains 576 codewords, which is half of the number of codewords of the block permutation code. □
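The figures quoted in Example 9 can be reproduced by exhaustive search. The sketch below is illustrative only: the blocks $s_1 = (1, 3, 5, 7)$ and $s_2 = (2, 4, 6, 8)$ are a hypothetical choice that is consistent with $\Delta_1^2 = 8$ and $\Delta_2^2 = 4$, not necessarily the vector used in the example, and the function names are assumptions.

```python
from itertools import combinations, permutations

def sqdist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def block_code(blocks):
    """All codewords Xs: permute the blocks, then permute inside each block."""
    code = set()
    for order in permutations(blocks):
        partial = [()]
        for blk in order:
            partial = [c + q for c in partial for q in permutations(blk)]
        code.update(partial)
    return code

def product_code(blocks):
    """Cartesian product of the per-block permutation codes (no block swaps)."""
    partial = [()]
    for blk in blocks:
        partial = [c + q for c in partial for q in permutations(blk)]
    return set(partial)

def deltas(blocks):
    """The two quantities of (54) by exhaustive search."""
    d1 = min(sqdist(b, q) for b in blocks
             for q in permutations(b) if q != tuple(b))
    d2 = min(sqdist(b, q) for b, c in permutations(blocks, 2)
             for q in permutations(c))
    return d1, d2

def min_sqdist(code):
    return min(sqdist(u, v) for u, v in combinations(code, 2))

# Hypothetical initial vector (an assumption, not taken from the paper).
S_BLOCKS = [(1, 3, 5, 7), (2, 4, 6, 8)]

d1, d2 = deltas(S_BLOCKS)
print(d1, d2, min(d1, 2 * d2))
print(len(block_code(S_BLOCKS)), len(product_code(S_BLOCKS)))
```

Under this assumed vector, the exhaustive minimum distance of the block permutation code agrees with formula (55), and the block permutation code has twice as many codewords as the product code.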

VI. Randomly constrained permutation matrices 

In the previous section, we discussed sets of structured permutation matrices. Another possible choice for the linear constraints is to generate them randomly. Such random linear constraints are amenable to probabilistic analysis and appear interesting from an information-theoretic viewpoint. In this section, we study a class of LP decodable permutation codes defined based on random constraints.



A. Sparse constraint matrix ensemble 

Since the LP decodable permutation codes are non-linear codes, the cardinality of a given code cannot be determined directly 
from the constraints in general. In the following part of this section, we will analyze the cardinality of codes and their Hamming 
weight distributions. 

A sparse constraint matrix ensemble is assumed in the following analysis, which has a close relationship to the analysis on 
average weight distribution of LDPC ensembles [12]. 

The linear constraint assumed here is an equality constraint between two variables, such as $X_{i,j} = X_{k,l}$. As discussed in Section V, linearly constrained permutation matrices defined by such equality constraints are important because they can be used as building blocks of a generalized block permutation code.

Let $S$ be the set of binary constraint matrices:

\[ S \triangleq \{A \in \{0, 1\}^{m \times n^2} : \text{every row of } A \text{ contains exactly two ones}\}. \tag{56} \]

We assign the uniform probability

\[ P(A) \triangleq \frac{1}{\binom{n^2}{2}^m} \tag{57} \]

to each matrix in $S$. The pair $(S, P)$ can be considered as an ensemble of matrices, which becomes the basis of the following probabilistic method.

Assume that $\theta : S \to \{-1, 0, 1\}^{m \times n^2}$ is defined by $B = \theta(A)$, where

\[ B_{i,j} \triangleq \begin{cases} -A_{i,j}, & \text{if } \forall l \in [1, j-1],\ A_{i,l} = 0, \\ A_{i,j}, & \text{otherwise.} \end{cases} \tag{58} \]

Note that $\theta(A)\,\mathrm{vec}(X) = 0$ corresponds to $m$ equality constraints of two variables.
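The ensemble and the sign map can be sketched in a few lines of Python. This is an illustration under the reading of (58) given above, in which the first one of each row is negated; `sample_constraint_matrix` and `theta` are hypothetical names.

```python
import random

def sample_constraint_matrix(m, n, rng):
    """Draw A uniformly from S: each row has exactly two ones among n*n entries."""
    A = []
    for _ in range(m):
        row = [0] * (n * n)
        for pos in rng.sample(range(n * n), 2):
            row[pos] = 1
        A.append(row)
    return A

def theta(A):
    """Negate the first one of every row, following (58)."""
    B = []
    for row in A:
        first = row.index(1)
        B.append([-v if j == first else v for j, v in enumerate(row)])
    return B

rng = random.Random(0)
A = sample_constraint_matrix(3, 3, rng)
for row in theta(A):
    print(row)
```

Each output row contains one entry $-1$ followed later by one entry $+1$, so the product of a row with $\mathrm{vec}(X)$ vanishes exactly when the two selected entries of $X$ are equal.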

In this section, we focus on the LP decodable permutation code $\Lambda(\theta(A), 0, \trianglelefteq, s)$, where $A \in S$ and $\trianglelefteq = (=, =, \ldots, =)^T$. The symbol 1 denotes the vector of length $m$ whose entries are all ones. Extensions of the analysis to more general classes of LP decodable permutation codes are possible, but we here focus on the simplest class to explain the idea of the analysis. Throughout this section, we assume that the components of the initial vector $s$ are all distinct.

B. Probabilistic analysis on average cardinality of codes 
The number of codewords in $\Lambda(\theta(A), 0, \trianglelefteq, s)$ is given by

\[ M(A) \triangleq \sum_{X \in \Pi_n} I[\theta(A)\,\mathrm{vec}(X) \trianglelefteq 0], \tag{59} \]

where $I$ is the indicator function, which takes the value one when the given condition is true and zero otherwise. The next lemma gives the average cardinality of this code.
Lemma 4 (Average cardinality of codes): The average cardinality of $\Lambda(\theta(A), 0, \trianglelefteq, s)$ is given by

\[ E[M(A)] = n! \left( \frac{\binom{n}{2} + \binom{n^2 - n}{2}}{\binom{n^2}{2}} \right)^m, \tag{60} \]

where the operator $E$ denotes the expectation defined on $(S, P)$.

Proof: From the definition of $M(A)$, the expectation of the cardinality $M(A)$ can be written as

\[ E[M(A)] = \sum_{A \in S} P(A) M(A) = \sum_{A \in S} P(A) \sum_{X \in \Pi_n} I[\theta(A)\,\mathrm{vec}(X) \trianglelefteq 0]. \tag{61} \]

By changing the order of summation, we can further transform this into

\[ E[M(A)] = \sum_{X \in \Pi_n} \sum_{A \in S} P(A)\, I[\theta(A)\,\mathrm{vec}(X) \trianglelefteq 0] = \frac{n!}{\binom{n^2}{2}^m} \sum_{A \in S} I[\theta(A)\,\mathrm{vec}(X') \trianglelefteq 0], \tag{62} \]

where $X'$ is an arbitrary permutation matrix in $\Pi_n$. The last equality is due to the symmetry of the ensemble; namely, the quantity $\sum_{A \in S} I[\theta(A)\,\mathrm{vec}(X') \trianglelefteq 0]$ does not depend on the choice of $X'$. The evaluation of this quantity can be performed on the basis of the following combinatorial argument.

It is evident that any $X' \in \Pi_n$ contains $n$ ones as its components. This implies that $x' = \mathrm{vec}(X')$ is a binary vector of length $n^2$ with Hamming weight $n$. Let $I_1 \triangleq \{i \in [1, n^2] \mid x'_i = 1\}$, where $x'_i$ is the $i$th element of $x'$. Consider the first row of $A$, which is denoted by $a$. The relation $\theta(a) x' = 0$ holds if and only if

\[ |\{i \in I_1 \mid a_i = 1\}| = 2 \quad \text{or} \quad |\{i \in [1, n^2] \setminus I_1 \mid a_i = 1\}| = 2. \tag{63} \]

The number of possible ways to choose such a vector $a$ is given by

\[ \binom{n}{2} + \binom{n^2 - n}{2}. \tag{64} \]

The term $\binom{n}{2}$ corresponds to the number of ways in which both ones of $a$ fall inside $I_1$ (of cardinality $n$). On the other hand, $\binom{n^2 - n}{2}$ represents the number of ways in which both ones fall in the remaining part. Since each row of $A$ can be chosen independently, we consequently have

\[ \sum_{A \in S} I[\theta(A)\,\mathrm{vec}(X') \trianglelefteq 0] = \left( \binom{n}{2} + \binom{n^2 - n}{2} \right)^m. \tag{65} \]

Substituting (65) into (62), we immediately obtain the claim of the lemma. ■
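For very small parameters, the ensemble average in Lemma 4 can be verified exactly by enumerating every constraint matrix. The Python sketch below is illustrative: it represents each row of $A$ by its pair of selected positions and uses exact rational arithmetic; the function names are assumptions.

```python
from fractions import Fraction
from itertools import combinations, permutations, product
from math import comb, factorial

def vec_perms(n):
    """vec(X) for every n x n permutation matrix X, as 0/1 tuples."""
    return [tuple(int(p[i] == j) for i in range(n) for j in range(n))
            for p in permutations(range(n))]

def exact_average_cardinality(n, m):
    """Average of M(A) over all A in S by exhaustive enumeration."""
    rows = list(combinations(range(n * n), 2))   # a row = a pair of positions
    xs = vec_perms(n)
    total = 0
    for A in product(rows, repeat=m):
        # x is a codeword matrix iff every selected pair of entries is equal
        total += sum(all(x[a] == x[b] for a, b in A) for x in xs)
    return Fraction(total, len(rows) ** m)

def formula_average_cardinality(n, m):
    """Closed form (60)."""
    good = comb(n, 2) + comb(n * n - n, 2)
    return factorial(n) * Fraction(good, comb(n * n, 2)) ** m

for n, m in [(2, 1), (3, 1), (3, 2)]:
    print(n, m, exact_average_cardinality(n, m), formula_average_cardinality(n, m))
```

The enumeration grows as $\binom{n^2}{2}^m$, so it is only a consistency check for tiny $n$ and $m$, not a substitute for the combinatorial argument.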
Example 10: In this experiment, the number of $10 \times 10$ permutation matrices satisfying randomly generated equality constraints of two variables was counted. Figure 7 plots the cardinality of 100 samples for the cases where $m = 30, 40, 50$.

The figure includes the ensemble average of the cardinality given by (60) and the sample mean of the cardinality. The figure shows that the cardinalities are scattered around the ensemble average and that the sample mean agrees with the ensemble average with reasonable accuracy.

This figure shows a trade-off relation between the number of additional equalities $m$ and the cardinality. As (60) indicates, the average cardinality is an exponentially decreasing function of $m$. □

Fig. 7. Relation between additional equalities m and average cardinality
C. Probabilistic analysis on weight distribution

The origin $o = (o_1, \ldots, o_n)$ is an arbitrary permutation vector of length $n$; namely, $o \in \Lambda(s)$. The number of codewords of $\Lambda(\theta(A), 0, \trianglelefteq, s)$ with Hamming weight $w$ is denoted by $L_w(A)$, where the Hamming weight $w_H(\cdot)$ is defined by

\[ w_H(x) \triangleq \sum_{i=1}^{n} I[o_i \ne x_i], \tag{66} \]

where $x = (x_1, \ldots, x_n)$. This means that the Hamming weight of $x$ is equal to the Hamming distance between the origin and $x$. In other words, $L_w(A)$ is defined as

\[ L_w(A) \triangleq \sum_{x \in \Lambda(\theta(A), 0, \trianglelefteq, s)} I[w_H(x) = w]. \]

The set $\{L_1(A), \ldots, L_n(A)\}$ is referred to as the weight distribution of $\Lambda(\theta(A), 0, \trianglelefteq, s)$.
The next lemma gives the ensemble average of the weight distribution.

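Lemma 5: The average weight distribution of the linearly constrained permutation code $\Lambda(\theta(A), 0, \trianglelefteq, s)$ is given by

\[ E[L_w(A)] = \binom{n}{w} \left\lfloor \frac{w! + 1}{e} \right\rfloor \left( \frac{\binom{n}{2} + \binom{n^2 - n}{2}}{\binom{n^2}{2}} \right)^m. \tag{67} \]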

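Proof: The weight distribution $L_w(A)$ can also be expressed as

\[ L_w(A) = \sum_{X \in Z_w(o)} I[\theta(A)\,\mathrm{vec}(X) \trianglelefteq 0], \tag{68} \]

where $Z_w(o)$ is defined by

\[ Z_w(o) \triangleq \{X \in \Pi_n : w_H(Xs) = w\}. \tag{69} \]

The expectation can be simplified as follows:

\begin{align}
E[L_w(A)] &= \sum_{A \in S} P(A) \sum_{X \in Z_w(o)} I[\theta(A)\,\mathrm{vec}(X) \trianglelefteq 0] \tag{70} \\
&= \sum_{X \in Z_w(o)} \sum_{A \in S} P(A)\, I[\theta(A)\,\mathrm{vec}(X) \trianglelefteq 0] \notag \\
&= \left( \frac{\binom{n}{2} + \binom{n^2 - n}{2}}{\binom{n^2}{2}} \right)^m |Z_w(o)|. \tag{71}
\end{align}

The last equality is due to the symmetry of the ensemble and (65).

The cardinality of $Z_w(o)$ is given by the following combinatorial argument. Let $x \in \Lambda(s)$ be an arbitrary vector satisfying $w_H(x) = w$. The index set $I_{\mathrm{diff}}$ is defined by $I_{\mathrm{diff}}(x) \triangleq \{i \in [1, n] \mid o_i \ne x_i\}$. Let $T \subset [1, n]$ be an index set of cardinality $w$. The quantity $|\{x \in \Lambda(s) \mid T = I_{\mathrm{diff}}(x)\}|$ is equal to the number of derangements of length $w$, which is known to be $\lfloor (w! + 1)/e \rfloor$ [33]. Note that the number of possible ways to choose $T$ is $\binom{n}{w}$. Thus, we have the equality

\[ |Z_w(o)| = \binom{n}{w} \left\lfloor \frac{w! + 1}{e} \right\rfloor. \tag{72} \]

This completes the proof of the lemma. ■

Note that the origin assumed here may not be included in $\Lambda(\theta(A), 0, \trianglelefteq, s)$.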
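The count $|Z_w(o)|$ in (72) can be checked against a direct enumeration of permutation vectors. The Python sketch below is illustrative; it assumes the origin $o = (0, 1, \ldots, n-1)$ and evaluates the derangement formula in floating point, which is adequate for the small $w$ used here.

```python
from itertools import permutations
from math import comb, e, factorial, floor

def derangements_formula(w):
    """floor((w! + 1) / e), the closed form quoted in the text (w >= 1)."""
    return floor((factorial(w) + 1) / e)

def weight_counts(n):
    """counts[w] = number of permutation vectors at Hamming distance w
    from the origin (0, 1, ..., n-1)."""
    origin = tuple(range(n))
    counts = [0] * (n + 1)
    for x in permutations(origin):
        counts[sum(a != b for a, b in zip(origin, x))] += 1
    return counts

n = 6
counts = weight_counts(n)
for w in range(1, n + 1):
    print(w, counts[w], comb(n, w) * derangements_formula(w))
```

For every weight $w \ge 1$ the enumerated count coincides with $\binom{n}{w} \lfloor (w! + 1)/e \rfloor$; the case $w = 0$ corresponds to the origin itself.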
VII. Conclusion

In this paper, a novel class of permutation codes, LP decodable permutation codes, is introduced. The LP decodable property 
is the main feature of this class of permutation codes. 

The set of doubly stochastic matrices, i.e., the Birkhoff polytope, has $n!$ integral vertices, which are the permutation matrices. Additional linear constraints define a code polytope, which plays a fundamental role in the coding scheme presented in this paper. An LP decodable permutation code is the set of integral vertices of a code polytope.

In an LP decoding process, a certain linear objective function is maximized under the assumption that the feasible set is a 
code polytope. The decoding performance can be evaluated from geometrical properties of a code polytope. 

The choice of additional linear constraints is crucial for constructing good codes. In this paper, two approaches are discussed; namely, structured permutation matrices and randomly constrained permutation matrices.

Section V introduces some classes of structured linearly constrained permutation matrices. In particular, it has been shown that the pure involution codes have several nice properties; they are easy to encode and their error correction performance is much better than that of the trivial repetition code.

The random constraints discussed in Section VI enable us to use probabilistic methods for analyzing some properties of codes. The probabilistic methods [26] are a very powerful tool for grasping the relation between the number of constraints and important code parameters such as the cardinality of a code.

Although the paper provides fundamental aspects of the LP decodable permutation codes, a number of problems still remain open. The following list gives some of these open problems.

1) Construction of good block permutation codes including a choice of an initial vector 

2) Efficient algorithm for solving the LP problem arising in the LP decoding. 

3) Permutation modulation for linear vector channels: let $H$ be an $n \times n$ real matrix. An ML decoding problem for a linear vector channel can be formulated as

\[ \text{minimize } \|y - Hx\|^2 \text{ subject to } x \in \Lambda(A, b, \trianglelefteq, s). \tag{73} \]

As discussed in this paper, the decoding problem can be relaxed to a quadratic programming (QP) problem:

\[ \text{minimize } \|y - Hx\|^2 \text{ subject to } x = Xs, \tag{74} \]

where $X$ ranges over the code polytope defined by $(A, b, \trianglelefteq)$. A QP-based decoding algorithm like [31] appears interesting for this problem.

4) An application to rank modulation 

Further investigation on related topics may open an interesting interdisciplinary research field among coding and combinatorial 
optimization. 

Appendix 

1) Code polytopes for some classes of linearly constrained permutation matrices: Table I presents linear constraints for some sets of permutation matrices and the integrality of the corresponding code polytopes. In this table, it is assumed that $X \in \mathbb{R}^{4 \times 4}$. The integrality was checked numerically with the vertex enumeration program cdd, which is based on the double description method, by K. Fukuda [32].

Some remarks on Table I are listed as follows.
TABLE I

Code polytopes and their properties (n = 4)

set of perm. matrices    additional constraints         integrality    |V|
cyclic perm. mat.        (75)                           Y              4
derangement              trace(X) = 0                   Y              9
involution               X = X^T                        N              14
transposition (1)        trace(X) = n - 2               N              20
transposition (2)        trace(X) = n - 2, X = X^T      Y              6
2 x 2 block              constraints (50)               N              28
2 x 2 block              constraints (50) and (77)      Y              8

The integrality column (Y/N) indicates whether the code polytope is integral (Y) or not (N). The column |V| denotes the number of vertices of the code polytope.



1) Cyclic permutation matrices: The cyclic permutation matrices of order 4 are given by the following additional linear constraints:

\[ \begin{aligned} X_{1,1} &= X_{2,2}, & X_{2,2} &= X_{3,3}, & X_{3,3} &= X_{4,4}, \\ X_{2,1} &= X_{3,2}, & X_{3,2} &= X_{4,3}, & X_{4,3} &= X_{1,4}, \\ X_{3,1} &= X_{4,2}, & X_{4,2} &= X_{1,3}, & X_{1,3} &= X_{2,4}, \\ X_{4,1} &= X_{1,2}, & X_{1,2} &= X_{2,3}, & X_{2,3} &= X_{3,4}. \end{aligned} \tag{75} \]

In a similar way to the case $n = 4$, we can define the cyclic permutation matrices of order $n$. The general expression of the constraint for $n \times n$ cyclic permutation matrices is given by

\[ \forall i, j \in [1, n], \quad X_{i,j} = X_{(i \bmod n) + 1,\ (j \bmod n) + 1}. \tag{76} \]

2) Transposition: The set of permutation matrices satisfying the linear constraint $\mathrm{trace}(X) = n - 2$ exactly coincides with the set of transpositions (i.e., permutations that exchange two elements). Note that the constraint $\mathrm{trace}(X) = n - 2$ alone does not give a tight polytope. By combining the redundant constraint $X = X^T$ (i.e., the involution constraint) with the trace constraint, the relaxed polytope becomes tight. This example indicates that redundant constraints are necessary for constructing a tight polytope in some cases.

3) Block constraint: The linear constraints (50) for block permutation matrices introduced in Theorem 3 do not give a tight polytope for $n = 4$. However, combining (50) with a set of redundant constraints (i.e., the 90 degree rotation of (50))

\[ \begin{aligned} X_{1,1} + X_{1,2} + X_{2,3} + X_{2,4} &= 1, \\ X_{2,1} + X_{2,2} + X_{1,3} + X_{1,4} &= 1, \\ X_{3,1} + X_{3,2} + X_{4,3} + X_{4,4} &= 1, \\ X_{4,1} + X_{4,2} + X_{3,3} + X_{3,4} &= 1, \end{aligned} \tag{77} \]

we have the convex hull of the $2 \times 2$ block permutation matrices. This case also shows the importance of redundant constraints from the optimization perspective. From this result, it is expected that the LP decoding performance of block permutation codes might be improved by incorporating these redundant linear equalities.
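The numbers of integral points behind several rows of Table I can be recounted by enumerating the 24 permutation matrices of order 4. The Python sketch below is illustrative and only counts permutation matrices that satisfy each constraint; it does not enumerate fractional vertices, so for non-integral polytopes its counts are smaller than the $|V|$ column. The predicate names are assumptions.

```python
from itertools import permutations

def perm_matrices(n):
    for p in permutations(range(n)):
        yield [[int(p[i] == j) for j in range(n)] for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def is_symmetric(X):
    n = len(X)
    return all(X[i][j] == X[j][i] for i in range(n) for j in range(n))

def is_cyclic(X):
    """Constraint (76) with 0-based indices."""
    n = len(X)
    return all(X[i][j] == X[(i + 1) % n][(j + 1) % n]
               for i in range(n) for j in range(n))

def count(pred, n=4):
    return sum(1 for X in perm_matrices(n) if pred(X))

counts = {
    "cyclic": count(is_cyclic),
    "derangement": count(lambda X: trace(X) == 0),
    "involution": count(is_symmetric),
    "transposition": count(lambda X: trace(X) == 2),
}
print(counts)
```

The cyclic, derangement, and transposition counts (4, 9, and 6) match the integral rows of the table, while the 10 symmetric permutation matrices are fewer than the 14 vertices listed for the non-integral involution polytope.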
Proof of Theorem 3

Proof: In the first part of the proof, we will show that any block permutation matrix satisfies (49).

Assume that $k, b \in [1, \gamma]$ and $l \in [1, \nu]$ are arbitrarily chosen. From the definition of the skewed column set $U^{(l)}_{k,b}$, the left-hand side of (49) can be rewritten as

\[ \sum_{(u,v) \in U^{(l)}_{k,b}} X_{u,v} = \sum_{(u,v) \in T^{(l)}_{k,b}} X_{u,v} + \sum_{k' \in [1,\gamma] \setminus \{k\}} \left( \sum_{(u,v) \in T^{((l \bmod \nu)+1)}_{k',b}} X_{u,v} \right). \tag{78} \]

Recall that $X$ is assumed to be a block permutation matrix. This means that there exists a unique block index $\tilde{k} \in [1, \gamma]$ satisfying $X(B_{\tilde{k},b}) \ne 0$ for the given block index $b$, and the sub-matrix $X(B_{\tilde{k},b})$ is a permutation matrix. If $\tilde{k} = k$ holds, then

\[ \sum_{(u,v) \in U^{(l)}_{k,b}} X_{u,v} = \sum_{(u,v) \in T^{(l)}_{k,b}} X_{u,v} = 1 \tag{79} \]

holds. Otherwise (i.e., $\tilde{k} \ne k$), the equality

\[ \sum_{(u,v) \in U^{(l)}_{k,b}} X_{u,v} = \sum_{(u,v) \in T^{((l \bmod \nu)+1)}_{\tilde{k},b}} X_{u,v} = 1 \tag{80} \]

holds. Thus, it has been proved that (49) holds if $X$ is a block permutation matrix.

We then move to the opposite direction; i.e., (49) implies that $X$ is a block permutation matrix.

Assume that a block index $b \in [1, \gamma]$ and a subindex $l \in [1, \nu]$ are arbitrarily chosen. Let $j = \nu(b - 1) + l$. Since $X$ is a permutation matrix, there exists a unique row index $i \in [1, n]$ satisfying $X_{i,j} = 1$. The block $B_{k,b}$ containing the index pair $(i, j)$ is uniquely determined because the blocks are mutually disjoint. Under this setting, it is clear that $X(B_{k,b}) \ne 0$ holds.

In the following, we will show that

\[ k' \ne k \Rightarrow X(B_{k',b}) = 0. \tag{81} \]

From the definition of the block index $k$, it is clear that

\[ \sum_{(u,v) \in T^{(l)}_{k,b}} X_{u,v} = 1 \tag{82} \]

holds. Combining (49), (78), and (82), we immediately obtain

\[ \sum_{k' \in [1,\gamma] \setminus \{k\}} \left( \sum_{(u,v) \in T^{((l \bmod \nu)+1)}_{k',b}} X_{u,v} \right) = 0. \tag{83} \]

This equality implies that

\[ (u, v) \in \bigcup_{k' \in [1,\gamma] \setminus \{k\}} T^{((l \bmod \nu)+1)}_{k',b} \Rightarrow X_{u,v} = 0. \tag{84} \]

Because $X$ is a permutation matrix,

\[ \sum_{(u,v) \in T^{((l \bmod \nu)+1)}_{k,b}} X_{u,v} = 1 \tag{85} \]

should be satisfied. Applying the same argument iteratively, we consequently have

\[ (u, v) \in \bigcup_{k' \in [1,\gamma] \setminus \{k\}}\ \bigcup_{l' \in [1,\nu]} T^{(l')}_{k',b} \Rightarrow X_{u,v} = 0. \tag{86} \]

This statement is equivalent to $k' \ne k \Rightarrow X(B_{k',b}) = 0$. Due to the definition of the block permutation matrix, it has been proved that $X$ is a block permutation matrix. ■